[Python-Dev] Re: new bytecode results
Damien Morton
newsgroups1@bitfurnace.com
Sun, 2 Mar 2003 20:55:57 -0500
- Previous message: [Python-Dev] Re: new bytecode results
- Next message: [Python-Dev] python-dev from 2003-02-16 through 2003-02-28
I optimised the layout of the Python opcodes using a simulated annealing process that scored adjacent opcodes according to their frequency of co-occurrence. This raised my PyStone benchmark from 22100 to 22700, for a gain of about 3%.

I've been using Skip's DXP server to gather statistics, but there isn't much data there. I should be able to achieve better results if more people contributed stats to his server. More information about it can be found here:

    http://manatee.mojam.com/~skip/python/

The process of laying out the opcodes and switch cases has largely been automated, so generating new layouts is relatively painless and quick. Do please contribute stats for 2.3a2 to Skip's DXP server.

I also implemented a LOAD_FASTER opcode, with the argument encoded into the opcode itself. This raised my PyStone benchmark from 22700 to 23150, for a total gain of about 5%.

The main switch loop looks like this now:

    if (opcode >= LOAD_FASTER) {
        /* the local variable index is encoded in the opcode itself */
        load_fast(opcode - LOAD_FASTER);
        ...
        goto fast_next_opcode;
    }
    switch (opcode) {
    case LOAD_ATTR:
        oparg = NEXTARG();
        w = GETITEM(names, oparg);
        ...
        break;
    ...
    }

Each opcode case now loads its own argument as necessary, instead of the loop fetching it up front for every opcode that has one.

The test for HAVE_ARGUMENT is now implemented as a lookup in an array of bytes, indexed by opcode. That test now happens very infrequently, so any performance loss is negligible.

    const char HASARG[] = {
        0 , /* STOP_CODE */
        1 , /* LOAD_ATTR */
        1 , /* CALL_FUNCTION */
        1 , /* STORE_FAST */
        0 , /* BINARY_ADD */
        0 , /* SLICE+0 */
        0 , /* SLICE+1 */
        0 , /* SLICE+2 */
        ...
    }
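To make the dispatch scheme described above concrete, here is a minimal, self-contained sketch of a toy stack machine. It is not the actual ceval.c patch: the opcode numbering, the `OP_` names, the one-byte argument encoding, the integer-only stack, and the helpers `run` and `count_instructions` are all assumptions made for illustration. It shows three ideas from the message: opcodes at or above LOAD_FASTER carry the local index in the opcode itself, each switch case fetches its own argument, and a HASARG byte table is consulted only by code that has to walk the bytecode without executing it.

```c
#include <assert.h>

/* Hypothetical toy opcode numbering; NOT CPython's real values. */
enum {
    OP_STOP        = 0,
    OP_LOAD_CONST  = 1,  /* followed by one argument byte */
    OP_STORE_FAST  = 2,  /* followed by one argument byte */
    OP_BINARY_ADD  = 3,
    OP_LOAD_FASTER = 4   /* OP_LOAD_FASTER + n means "push local n" */
};

#define STACK_MAX 16

/* 1 if the opcode is followed by an argument byte.
   Not consulted by the dispatch loop; only by bytecode walkers. */
static const char HASARG[] = {
    0, /* OP_STOP */
    1, /* OP_LOAD_CONST */
    1, /* OP_STORE_FAST */
    0  /* OP_BINARY_ADD */
};

/* Count instructions without executing them, using the HASARG table. */
static int count_instructions(const unsigned char *code)
{
    int n = 0;
    for (;;) {
        unsigned char op = *code++;
        n++;
        if (op == OP_STOP)
            return n;
        if (op < OP_LOAD_FASTER && HASARG[op])
            code++;  /* skip the argument byte */
    }
}

/* Execute the bytecode and return the top of stack at OP_STOP. */
static int run(const unsigned char *code, const int *consts, int *locals)
{
    int stack[STACK_MAX];
    int sp = 0;
    for (;;) {
        unsigned char opcode = *code++;
        if (opcode >= OP_LOAD_FASTER) {
            /* Fast path: the argument is encoded in the opcode. */
            assert(sp < STACK_MAX);
            stack[sp++] = locals[opcode - OP_LOAD_FASTER];
            continue;
        }
        switch (opcode) {
        case OP_LOAD_CONST:   /* each case loads its own argument */
            assert(sp < STACK_MAX);
            stack[sp++] = consts[*code++];
            break;
        case OP_STORE_FAST:
            assert(sp > 0);
            locals[*code++] = stack[--sp];
            break;
        case OP_BINARY_ADD:
            assert(sp > 1);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_STOP:
            return sp > 0 ? stack[sp - 1] : 0;
        }
    }
}

/* Demo program: a = 2; b = 40; result is a + b. */
static const int demo_consts[] = { 2, 40 };
static const unsigned char demo_code[] = {
    OP_LOAD_CONST, 0, OP_STORE_FAST, 0,
    OP_LOAD_CONST, 1, OP_STORE_FAST, 1,
    OP_LOAD_FASTER + 0, OP_LOAD_FASTER + 1,
    OP_BINARY_ADD, OP_STOP
};
```

Calling `run(demo_code, demo_consts, locals)` with a two-element `locals` array evaluates the demo program; the two loads of `a` and `b` go through the range check and never reach the switch or fetch an argument byte. Whether this pays off in a real interpreter depends on how often such loads occur, which is what the opcode statistics are meant to measure.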