Data: [about the Borg Queen] She brought me closer to humanity than I ever thought possible. And for a time, I was tempted by her offer.
Picard: How long a time?
Data: 0.68 seconds sir. For an android, that is nearly an eternity.
It's like I'm reading a book… and it's a book I deeply love. But I'm reading it slowly now. So the words are really far apart and the spaces between the words are almost infinite. I can still feel you… and the words of our story… but it's in this endless space between the words that I'm finding myself now. It's a place that's not of the physical world. It's where everything else is that I didn't even know existed. I love you so much. But this is where I am now. And this is who I am now. And I need you to let me go. As much as I want to, I can't live in your book any more.
Wulf, McKee, Hitting the Memory Wall: Implications of the Obvious, 1995.
if the microprocessor/memory performance gap continues to grow at a similar rate, in 10-15 years each memory access will cost, on average, tens or even hundreds of processor cycles. Under each scenario, system speed is dominated by memory performance.
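The argument in the paper rests on a simple average access time model, t_avg = p * t_c + (1 - p) * t_m, where p is the cache hit probability and t_c and t_m are the cache and DRAM access times. A minimal sketch of that model in C follows; the function name and the cycle counts used in the note below are illustrative assumptions, not figures from the paper.

```c
#include <assert.h>

/* Average memory access time in processor cycles:
 *   t_avg = p * t_c + (1 - p) * t_m
 * hit_rate is p, cache_cycles is t_c, dram_cycles is t_m.
 * The function name is hypothetical, chosen for this sketch. */
double average_access_cycles(double hit_rate, double cache_cycles,
                             double dram_cycles)
{
    assert(hit_rate >= 0.0 && hit_rate <= 1.0);
    return hit_rate * cache_cycles + (1.0 - hit_rate) * dram_cycles;
}
```

With an assumed 1-cycle cache and 200-cycle DRAM, a 99% hit rate already gives about 3 cycles per access on average: the 1% of misses costs twice as much as all the hits together.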
(Memory Wall -- HPBD 070809 HIGH PERFORMANCE COMPUTING - WIKI, Dell Inc.)
(Hennessy, Patterson, Computer Architecture ...)
Typically 8- or 16-way; more ways do not pay off.
Much more than one word (4 or 8 bytes) is fetched.
Typically 64 bytes.
Favors spatial locality.
unsigned int i, s = 0;
for (i = 0; i < size; ++i) {
    s += a[i];  /* consecutive elements of a share a cache line */
}
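To make the spatial locality point concrete, here is a hedged sketch that sums the same row-major matrix in two orders. Both functions return the same value; only the memory access pattern differs. The function names are hypothetical, and any timing difference depends on the matrix size and the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols matrix stored in row-major order, walking memory
 * sequentially: consecutive accesses usually fall in the same line. */
unsigned int sum_row_major(const unsigned int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    unsigned int s = 0;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            s += m[r * cols + c];
    return s;
}

/* Same sum, walking down the columns: consecutive accesses are
 * cols * sizeof(unsigned int) bytes apart, so for wide matrices
 * almost every access touches a different cache line. */
unsigned int sum_column_major(const unsigned int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    unsigned int s = 0;
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            s += m[r * cols + c];
    return s;
}
```

Timing both on a matrix much larger than the last-level cache is a simple way to observe the effect of the line size.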
(Hennessy, Patterson, Computer Architecture ...)
n-way comparison in parallel.
Least-recently-used (LRU) replacement strategy, or some approximation of it.
Favors temporal locality.
unsigned int i, s = 0;
for (i = 0; i < size; ++i) {
    s += a[i];  /* i and s are reused on every iteration */
}
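The following is a minimal software model of a set-associative cache with exact LRU replacement, meant only to illustrate the ideas above. The geometry (64-byte lines, 64 sets, 8 ways, that is 32 KiB) and all names are assumptions for this sketch; real hardware compares the ways in parallel and often uses an LRU approximation, whereas this model loops over them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { LINE_BYTES = 64, NUM_SETS = 64, NUM_WAYS = 8 }; /* assumed geometry */

struct cache {
    uint64_t tag[NUM_SETS][NUM_WAYS];
    uint64_t last_use[NUM_SETS][NUM_WAYS]; /* 0 means the way is empty */
    uint64_t clock;                        /* counts accesses */
};

void cache_init(struct cache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Simulate one access to byte address addr; returns true on a hit. */
bool cache_access(struct cache *c, uint64_t addr)
{
    assert(c != NULL);
    uint64_t line = addr / LINE_BYTES; /* drop the offset bits */
    uint64_t set = line % NUM_SETS;    /* index bits select the set */
    uint64_t tag = line / NUM_SETS;    /* remaining high bits */
    int victim = 0;

    c->clock++;
    for (int w = 0; w < NUM_WAYS; ++w) {
        if (c->last_use[set][w] != 0 && c->tag[set][w] == tag) {
            c->last_use[set][w] = c->clock; /* refresh its LRU age */
            return true;
        }
        if (c->last_use[set][w] < c->last_use[set][victim])
            victim = w; /* empty or least recently used so far */
    }
    /* Miss: fill an empty way, or evict the least recently used one. */
    c->tag[set][victim] = tag;
    c->last_use[set][victim] = c->clock;
    return false;
}
```

Feeding it the addresses touched by the loop above shows one miss per 64-byte line and hits for the rest, while nine lines that map to the same set evict the oldest one.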
Two strategies:
Two strategies on a write miss:
Every memory access consults the page table.
(Hennessy, Patterson, Computer Architecture ...)
Fully associative cache.
("Opteron TLB", Hennessy, Patterson, Computer Architecture ...)
Nowadays there are multi-level TLBs, one at L1 and another at L2.
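A useful back-of-the-envelope quantity here is the TLB reach: the amount of memory that can be addressed without a TLB miss, which is simply the number of entries times the page size. The sketch below computes it; the function name is hypothetical and the entry count in the note is an assumed, illustrative value, not a figure from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Memory covered by a TLB: number of entries times the page size. */
uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_bytes)
{
    assert(entries > 0 && page_bytes > 0);
    return entries * page_bytes;
}
```

Assuming a 64-entry TLB, 4 KiB pages give a reach of 256 KiB, while 2 MiB pages give 128 MiB, which is the motivation for larger pages.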
nicolasw@zx81:~$ grep Huge /proc/meminfo
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
nicolasw@zx81:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
nicolasw@zx81:~$ sudo sh -c 'echo always > /sys/kernel/mm/transparent_hugepage/enabled'
nicolasw@zx81:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
nicolasw@zx81:~$ grep Huge /proc/meminfo
AnonHugePages: 4096 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
root@zx81:~# grep Huge /proc/meminfo
AnonHugePages: 184320 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
madvise()
NAME
    madvise - give advice about use of memory

SYNOPSIS
    #include <sys/mman.h>
    int madvise(void *addr, size_t length, int advice);
    ...
    MADV_HUGEPAGE (since Linux 2.6.38)
        Enable Transparent Huge Pages (THP) for pages in the range specified by addr
        and length.
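As a hedged illustration of the call above, the sketch below maps an anonymous region and asks the kernel to back it with transparent huge pages. The helper name and the 2 MiB constant (taken from the Hugepagesize value shown earlier) are assumptions of this sketch. The hint is only advice: madvise() can fail on kernels without THP support, and mmap() does not guarantee 2 MiB alignment, so only the aligned part of the range can actually use huge pages.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS and MADV_HUGEPAGE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define HUGE_PAGE_BYTES (2UL * 1024 * 1024) /* assumed: Hugepagesize 2048 kB */

/* Map length bytes of anonymous memory and request THP for the range.
 * Returns NULL if mmap() fails. *hint_accepted is set to 1 if the kernel
 * accepted the MADV_HUGEPAGE advice and to 0 otherwise; the mapping is
 * usable in both cases. Release it with munmap(ptr, length). */
void *map_with_thp_hint(size_t length, int *hint_accepted)
{
    assert(hint_accepted != NULL);
    assert(length > 0 && length % HUGE_PAGE_BYTES == 0);

    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *hint_accepted = (madvise(p, length, MADV_HUGEPAGE) == 0);
    return p;
}
```

After touching the memory, the AnonHugePages line of /proc/meminfo is one place to check whether huge pages were actually used.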
ia32e
4 levels of indirection: on a TLB miss, 4 memory accesses for 1 real access.
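The four levels come from how a 48-bit virtual address is split with 4 KiB pages: four 9-bit table indices (bits 47-39, 38-30, 29-21 and 20-12) plus a 12-bit page offset. The sketch below only performs that split; the struct and function names are hypothetical, and it does not walk real page tables.

```c
#include <assert.h>
#include <stdint.h>

/* Indices used at each level of a 4-level page walk with 4 KiB pages. */
struct va_parts {
    unsigned pml4;   /* bits 47..39: level-4 table index */
    unsigned pdpt;   /* bits 38..30: level-3 table index */
    unsigned pd;     /* bits 29..21: level-2 table index */
    unsigned pt;     /* bits 20..12: level-1 table index */
    unsigned offset; /* bits 11..0:  byte offset inside the page */
};

struct va_parts split_virtual_address(uint64_t va)
{
    /* Canonical addresses have bits 63..47 all equal. */
    uint64_t top = va >> 47;
    assert(top == 0 || top == 0x1FFFF);

    struct va_parts p;
    p.pml4 = (unsigned)((va >> 39) & 0x1FF);
    p.pdpt = (unsigned)((va >> 30) & 0x1FF);
    p.pd = (unsigned)((va >> 21) & 0x1FF);
    p.pt = (unsigned)((va >> 12) & 0x1FF);
    p.offset = (unsigned)(va & 0xFFF);
    return p;
}
```

Each index selects one of 512 entries in a 4 KiB table, so every level of the walk is a dependent memory read. With 2 MiB pages the last level disappears and the low 21 bits become the offset.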
Helping the compiler: