TECHNOTES.txt

   1 APC Quick-Start Braindump
   2
   3 This is a rapidly written braindump of how APC currently works in the
   4 form of a quick-start guide to start hacking on APC.
   5
   6 1. Install and use APC a bit so you know what it does from the end-user's
   7    perspective.
   8    user-space functions are all explained here:
   9
  10 2. Grab the current APC code from CVS:
  11
  12     cvs -d:pserver:cvsread@cvs.php.net:/repository login
  13     Password: phpfi
  14     cvs -d:pserver:cvsread@cvs.php.net:/repository co pecl/apc
  15
  16    apc/php_apc.c has most of the code for the user-visible stuff.  It is
  17    also a regular PHP extension in the sense that there are MINIT, MINFO,
  18    MSHUTDOWN, RSHUTDOWN, etc. functions.
  19
  20 3. Build it.
  21
  22    cd pecl/apc
  23    phpize
  24    ./configure --enable-apc --enable-mmap
  25    make
  26    cp modules/apc.so /usr/local/lib/php
  27    apachectl restart
  28
  29 4. Debugging Hints
  30
  31      apachectl stop
  32      gdb /usr/bin/httpd
  33      break ??
  34      run -X
  35
  36    Grab the .gdbinit from the PHP source tree and have a look at the macros.
  37
  38 5. Look through apc/apc_sma.c
  39    It is a pretty standard memory allocator.
  40
  41    apc_sma_malloc, apc_sma_realloc, apc_sma_strdup and apc_sma_free behave to the
  42    caller just like malloc, realloc, strdup and free
  43
  44    On server startup the MINIT hook in php_apc.c calls apc_module_init() in
  45    apc_main.c which in turn calls apc_sma_init().  apc_sma_init calls into
  46    apc_mmap.c to mmap the specified sized segment (I tend to just use a single
  47    segment).  apc_mmap.c should be self-explanatory.  It mmaps a temp file and
  48    then unlinks that file right after the mmap to provide automatic shared memory
  49    cleanup in case the process dies.
  50
  51    Once the region has been initialized we stick a header_t at the beginning
  52    of the region.  It contains the total size in header->segsize and the number
  53    of bytes available in header->avail.
  54
  55    After the header comes a bit of a hack.  A zero-sized block is inserted just
  56    to make things easier later on.  And then a huge block that is basically
  57    the size of the entire segment minus the two (for the 0-sized block, and this one)
  58    block headers.
  59
  60    The code for this is:
  61
  62      header = (header_t*) shmaddr;
  63      header->segsize = sma_segsize;
  64      header->avail = sma_segsize - sizeof(header_t) - sizeof(block_t) - alignword(sizeof(int));
  65      memset(&header->lock,0,sizeof(header->lock));
  66      sma_lock = &header->lock;
  67      block = BLOCKAT(sizeof(header_t));
  68      block->size = 0;
  69      block->next = sizeof(header_t) + sizeof(block_t);
  70      block = BLOCKAT(block->next);
  71      block->size = header->avail;
  72      block->next = 0;
  73
  74    So the shared memory looks like this:
  75
  76      +--------+-------+---------------------------------+
  77      | header | block |             block               |
  78      +--------+-------+---------------------------------+
  79
  80    sma_shmaddrs[0] gives you the address of header
  81
  82    The blocks are just a simple offset-based linked list (so no pointers):
  83
  84      typedef struct block_t block_t;
  85      struct block_t {
  86          size_t size;       /* size of this block */
  87          size_t next;       /* offset in segment of next free block */
  88          size_t canary;     /* canary to check for memory overwrites */
  89 #ifdef __APC_SMA_DEBUG__
  90          int id;         /* identifier for the memory block */
  91 #endif
  92      };
  93
  94    The BLOCKAT macro turns an offset into an actual address for you:
  95
  96      #define BLOCKAT(offset) ((block_t*)((char *)shmaddr + offset))
  97
  98    where shmaddr = sma_shaddrs[0]
  99
 100    And the OFFSET macro goes the other way:
 101
 102      #define OFFSET(block) ((int)(((char*)block) - (char*)shmaddr))
 103
 104    Allocating a block with a call to apc_sma_allocate() walks through the
 105    linked list of blocks until it finds one that is >= to the requested size.
 106    The first call to apc_sma_allocate() will hit the second block.  We then
 107    chop up that block so it looks like this:
 108
 109      +--------+-------+-------+-------------------------+
 110      | header | block | block |         block           |
 111      +--------+-------+-------+-------------------------+
 112
 113    Then we unlink that block from the linked list so it won't show up
 114    as an available block on the next allocate.  So we actually have:
 115
 116      +--------+-------+       +-------------------------+
 117      | header | block |------>|         block           |
 118      +--------+-------+       +-------------------------+
 119
 120    And header->avail along with block->size of the remaining large
 121    block are updated accordingly.  The arrow there representing the
 122    link which now points to a block with an offset further along in
 123    the segment.
 124
 125    When the block is freed using apc_sma_deallocate() the steps are
 126    basically just reversed.  The block is put back and then the deallocate
 127    code looks at the block before and after to see if the block immediately
 128    before and after are free and if so the blocks are combined.  So you never
 129    have 2 free blocks next to each other, apart from at the front with that
 130    0-sized dummy block.  This mostly prevents fragmentation.  I have been
 131    toying with the idea of always allocating block at 2^n boundaries to make
 132    it more likely that they will be re-used to cut down on fragmentation further.
 133    That's what the POWER_OF_TWO_BLOCKSIZE you see in apc_sma.c is all about.
 134
 135    Of course, anytime we fiddle with our shared memory segment we lock using
 136    the locking macros, LOCK() and UNLOCK().
 137
 138    That should mostly take care of the low-level shared memory handling.
 139
 140 6. Next up is apc_main.c and apc_cache.c which implement the meat of the
 141    cache logic.
 142
 143    The apc_main.c file mostly calls functions in apc_sma.c to allocate memory
 144    and apc_cache.c for actual cache manipulation.
 145
 146    After the shared memory segment is created and the caches are initialized,
 147    apc_module_init() installs the my_compile_file() function overriding Zend's
 148    version.  I'll talk about my_compile_file() and the rest of apc_compile.c
 149    in the next section.  For now I will stick with apc_main.c and apc_cache.c
 150    and talk about the actual caches.  A cache consists of a block of shared
 151    memory returned by apc_sma_allocate() via apc_sma_malloc().  You will
 152    notice references to apc_emalloc().  apc_emalloc() is just a thin wrapper
 153    around PHP's own emalloc() function which allocates per-process memory from
 154    PHP's pool-based memory allocator.  Don't confuse apc_emalloc() and
 155    apc_sma_malloc() as the first is per-process and the second is shared memory.
 156
 157    The cache is stored in/described by this struct allocated locally using
 158    emalloc():
 159
 160      struct apc_cache_t {
 161          void* shmaddr;              /* process (local) address of shared cache */
 162          header_t* header;           /* cache header (stored in SHM) */
 163          slot_t** slots;             /* array of cache slots (stored in SHM) */
 164          int num_slots;              /* number of slots in cache */
 165          int gc_ttl;                 /* maximum time on GC list for a slot */
 166          int ttl;                    /* if slot is needed and entry's access time is older than this ttl, remove it */
 167      };
 168
 169    Whenever you see functions that take a 'cache' argument, this is what they
 170    take.  And apc_cache_create() returns a pointer to this populated struct.
 171
 172    At the beginning of the cache we have a header.  Remember, we are down a level now
 173    from the sma stuff.  The sma stuff is the low-level shared-memory allocator which
 174    has its own header which is completely separate and invisible to apc_cache.c.
 175    As far as apc_cache.c is concerned the block of memory it is working with could
 176    have come from a call to malloc().
 177
 178    The header looks like this:
 179
 180      typedef struct header_t header_t;
 181      struct header_t {
 182          int num_hits;               /* total successful hits in cache */
 183          int num_misses;             /* total unsuccessful hits in cache */
 184          slot_t* deleted_list;       /* linked list of to-be-deleted slots */
 185      };
 186
 187    Since this is at the start of the shared memory segment, these values are accessible
 188    across all the yapache processes and hence access to them has to be locked.
 189
 190    After the header we have an array of slots.  The number of slots is user-defined
 191    through the apc.num_slots ini hint.  Each slot is described by:
 192
 193      typedef struct slot_t slot_t;
 194      struct slot_t {
 195          apc_cache_key_t key;        /* slot key */
 196          apc_cache_entry_t* value;   /* slot value */
 197          slot_t* next;               /* next slot in linked list */
 198          int num_hits;               /* number of hits to this bucket */
 199          time_t creation_time;       /* time slot was initialized */
 200          time_t deletion_time;       /* time slot was removed from cache */
 201          time_t access_time;         /* time slot was last accessed */
 202      };
 203
 204    The slot_t *next there is a linked list to other slots that happened to hash to the
 205    same array position.
 206
 207    apc_cache_insert() shows what happens on a new cache insert.
 208
 209      slot = &cache->slots[hash(key) % cache->num_slots];
 210
 211    cache->slots is our array of slots in the segment.  hash() is simply:
 212
 213      static unsigned int hash(apc_cache_key_t key)
 214      {
 215          return key.data.file.device + key.data.file.inode;
 216      }
 217
 218    That is, we use the file's device and inode to uniquely identify it.  Initially
 219    we had used the file's full path, but getting that requires a realpath() call which
 220    is amazingly expensive since it has to stat each component of the path to resolve
 221    symlinks and get rid of relative path components.  By using the device+inode we
 222    can uniquely identify a file with a single stat.
 223
 224    So, on an insert we find the array position in the slots array by hasing the device+inode.
 225    If there are currently no other slots there, we just create the slot and stick it into
 226    the array:
 227
 228      *slot = make_slot(key, value, *slot, t)
 229
 230    If there are other slots already at this position we walk the link list to get to
 231    the end.  Here is the loop:
 232
 233      while (*slot) {
 234          if (key_equals((*slot)->key.data.file, key.data.file)) {
 235              /* If existing slot for the same device+inode is different, remove it and insert the new version */
 236              if ((*slot)->key.mtime != key.mtime) {
 237                  remove_slot(cache, slot);
 238                  break;
 239              }
 240              UNLOCK(cache);
 241              return 0;
 242          } else if(cache->ttl && (*slot)->access_time < (t - cache->ttl)) {
 243              remove_slot(cache, slot);
 244              continue;
 245          }
 246          slot = &(*slot)->next;
 247      }
 248
 249    That first key_equals() check sees if we have an exact match meaning the file
 250    is already in the cache.  Since we try to find the file in the cache before doing
 251    an insert, this will generally only happen if another process managed to beat us
 252    to inserting it.  If we have a newer version of the file at this point we remove
 253    it an insert the new version.  If our version is not newer we just return without
 254    doing anything.
 255
 256    While walking the linked list we also check to see if the cache has a TTL defined.
 257    If while walking the linked list we see a slot that has expired, we remove it
 258    since we are right there looking at it.  This is the only place we remove stale
 259    entries unless the shared memory segment fills up and we force a full expunge via
 260    apc_cache_expunge().  apc_cache_expunge() walks the entire slots array and walks
 261    down every linked list removing stale slots to free up room.  This is obviously
 262    slow and thus only happens when we have run out of room.
 263
 264    apc_cache_find() simply hashes and returns the entry if it is there.  If it is there
 265    but older than the mtime in the entry we are looking for, we delete the one that is
 266    there and return indicating we didn't find it.
 267
 268    Next we need to understand what an actual cache entry looks like.  Have a look at
 269    apc_cache.h for the structs.  I sort of glossed over the key part earlier saying
 270    that we just used the device+inode to find a hash slot.  It is actually a bit more
 271    complex than that because we have two kinds of caches.  We have the standard file
 272    cache containing opcode arrays, but we also have a user-controlled cache that the
 273    user can insert whatever they want into via apc_store().  For the user cache we
 274    obviously don't have a device+inode.  The actual identifier is provided by the user
 275    as a char *.  So the key is actually a union that looks like this:
 276
 277      typedef union _apc_cache_key_data_t {
 278          struct {
 279              int device;             /* the filesystem device */
 280              int inode;              /* the filesystem inode */
 281          } file;
 282          struct {
 283              char *identifier;
 284          } user;
 285      } apc_cache_key_data_t;
 286
 287      struct apc_cache_key_t {
 288          apc_cache_key_data_t data;
 289          int mtime;                  /* the mtime of this cached entry */
 290      };
 291
 292    And we have two sets of functions to do inserts and finds.  apc_cache_user_find()
 293    and apc_cache_user_insert() operate on the user cache.
 294
 295    Ok, on to the actual cache entry.  Again, because we have two kinds of caches, we
 296    also have the corresponding two kinds of cache entries described by this union:
 297
 298      typedef union _apc_cache_entry_value_t {
 299          struct {
 300              char *filename;             /* absolute path to source file */
 301              zend_op_array* op_array;    /* op_array allocated in shared memory */
 302              apc_function_t* functions;  /* array of apc_function_t's */
 303              apc_class_t* classes;       /* array of apc_class_t's */
 304          } file;
 305          struct {
 306              char *info;
 307              zval *val;
 308              unsigned int ttl;
 309          } user;
 310      } apc_cache_entry_value_t;
 311
 312    And then the actual cache entry:
 313
 314      struct apc_cache_entry_t {
 315          apc_cache_entry_value_t data;
 316          unsigned char type;
 317          int ref_count;
 318      };
 319
 320    The user entry is pretty simple and not all that important for now.  I will
 321    concentrate on the file entries since that is what holds the actual compiled
 322    opcode arrays along with the functions and classes required by the executor.
 323
 324    apc_cache_make_file_entry() in apc_cache.c shows how an entry is constructed.
 325    The main thing to understand here is that we need more than just the opcode
 326    array, we also need the functions and classes created by the compiler when it
 327    created the opcode array.  As far as the executor is concerned, it doesn't know
 328    that it isn't operating in normal mode being called right after the parse/compile
 329    phase, so we need to recreate everything so it looks exactly like it would at
 330    that point.
 331
 332 7. my_compile_file() and apc_compile.c
 333
 334    my_compile_file() in apc_main.c controls where we get the opcodes from.  If
 335    the user-specified filters exclude the file from being cached, then we just
 336    call the original compile function and return.  Otherwise we fetch the request
 337    time from Apache to avoid an extra syscall, create the key so we can look up
 338    the file in the cache.  If we find it we stick it on a local stack which we
 339    use at cleanup time to make sure we return everything back to normal after a
 340    request and call cached_compile() which installs the functions and classes
 341    associated with the op_array in this entry and then copy the op_array down
 342    into our memory space for execution.
 343
 344    If we didn't find the file in the cache, we need to compile it and insert it.
 345    To compile it we simply call the original compile function:
 346
 347       op_array = old_compile_file(h, type TSRMLS_CC);
 348
 349    To do the insert we need to copy the functions, classes and the opcode array
 350    the compile phase created into shared memory.  This all happens in apc_compile.c
 351    in the apc_copy_op_array(), apc_copy_new_functions() and apc_copy_new_classes()
 352    functions.  Then we make the file entry and do the insert.  Both of these
 353    operations were described in the previous section.
 354
 355 8. The Optimizer
 356
 357    The optimizer has been deprecated.
 358
 359 If you made it to the end of this, you should have a pretty good idea of where things are in
 360 the code.  I skimmed over a lot of things, so plan on spending some time reading through the code.
 361