A small compiler that can convert Python scripts to pickle bytecode.
- Python 3.8+
No third-party modules are required.
Using pip:
$ pip install pickora
From source:
$ git clone https://github.com/splitline/Pickora.git
$ cd Pickora
$ python setup.py install
Compile from a string:
$ pickora -c 'from builtins import print; print("Hello, world!")' -o output.pkl
$ python -m pickle output.pkl # load the pickle bytecode
Hello, world!
None
Compile from a file:
$ echo 'from builtins import print; print("Hello, world!")' > hello.py
$ pickora hello.py # output compiled pickle bytecode to stdout directly
b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins\x8c\x05print\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
usage: pickora [-h] [-c CODE] [-p PROTOCOL] [-e] [-O] [-o OUTPUT] [-d] [-r]
[-f {repr,raw,hex,base64,none}]
[source]
A toy compiler that can convert Python scripts into pickle bytecode.
positional arguments:
source source code file
optional arguments:
-h, --help show this help message and exit
-c CODE, --code CODE source code string
-p PROTOCOL, --protocol PROTOCOL
pickle protocol
-e, --extended enable extended syntax (trigger find_class)
-O, --optimize optimize pickle bytecode (with pickletools.optimize)
-o OUTPUT, --output OUTPUT
output file
-d, --disassemble disassemble pickle bytecode
-r, --run run (load) pickle bytecode immediately
-f {repr,raw,hex,base64,none}, --format {repr,raw,hex,base64,none}
output format, none means no output
Basic usage: `pickora samples/hello.py` or `pickora --code 'print("Hello, world!")' --extended`
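To inspect or load a compiled payload from Python rather than with `python -m pickle`, the standard `pickle` and `pickletools` modules are enough. The sketch below reuses the byte string printed by `pickora hello.py` above verbatim; the variable names are only illustrative.

```python
import contextlib
import io
import pickle
import pickletools

# Bytes copied from the `pickora hello.py` output shown above.
payload = (
    b'\x80\x04\x95(\x00\x00\x00\x00\x00\x00\x00\x8c\x08builtins'
    b'\x8c\x05print\x93\x94\x94h\x01\x8c\rHello, world!\x85R.'
)

# List the opcodes without executing anything.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]

# Loading executes the payload: it calls print("Hello, world!") and
# returns whatever print returned.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    result = pickle.loads(payload)
```

Only load pickle data you trust: as this example shows, loading runs whatever calls the bytecode encodes.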
- Basic types: int, float, bytes, string, dict, list, set, tuple, bool, None
- Assignment: `val = dict_['x'] = obj.attr = 'meow'`
- Augmented assignment: `x += 1`
- Named assignment: `(x := 1337)`
- Unpacking: `a, b, c = 1, 2, 3`
- Function call: `f(arg1, arg2)`
  - Keyword arguments are not supported.
- Import
  - `from module import things` (directly using the `STACK_GLOBAL` opcode)
- Macros (see below for more details)
  - `STACK_GLOBAL`
  - `GLOBAL`
  - `INST`
  - `OBJ`
  - `NEWOBJ`
  - `NEWOBJ_EX`
  - `BUILD`
Note: All extended syntax is implemented by importing other built-in modules, so enabling this option will trigger `find_class` when the pickle bytecode is loaded.
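To see why this matters, recall that `find_class` is the hook `pickle.Unpickler` calls to resolve every global a pickle references. The sketch below logs those lookups with a subclass. The payload is hand-written protocol 0 bytecode for `operator.add(1, 2)`, not actual Pickora output, and `LoggingUnpickler` is a hypothetical name used only for illustration.

```python
import io
import pickle


class LoggingUnpickler(pickle.Unpickler):
    """Unpickler that records every global lookup before resolving it."""

    def __init__(self, file):
        super().__init__(file)
        self.lookups = []

    def find_class(self, module, name):
        self.lookups.append((module, name))
        return super().find_class(module, name)


# Hand-written bytecode: GLOBAL operator.add, MARK, INT 1, INT 2, TUPLE, REDUCE, STOP.
payload = b"coperator\nadd\n(I1\nI2\ntR."

unpickler = LoggingUnpickler(io.BytesIO(payload))
value = unpickler.load()
```

A loader that overrides `find_class` to reject unexpected modules would therefore also reject payloads that rely on such imports.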
- Attributes: `obj.attr` (using `builtins.getattr` only when you need to "load" an attribute)
- Operators (using `operator` module)
  - Binary operators: `+`, `-`, `*`, `/` etc.
  - Unary operators: `not`, `~`, `+val`, `-val`
  - Compare: `0 < 3 > 2 == 2 > 1` (using `builtins.all` for chained comparing)
  - Subscript: `list_[1:3]`, `dict_['key']` (using `builtins.slice` for slice)
  - Boolean operators (using `builtins.next`, `builtins.filter`)
    - and: using `operator.not_`
    - or: using `operator.truth`
    - `(a or b or c)` -> `next(filter(truth, (a, b, c)), c)`
    - `(a and b and c)` -> `next(filter(not_, (a, b, c)), c)`
- Import
  - `import module` (using `importlib.import_module`)
- Lambda
  - `lambda x,y=1: x+y`
  - Using `types.CodeType` and `types.FunctionType`
  - [Known bug] If any global variables are changed after the lambda definition, the lambda function won't see those changes.
There are currently 4 macros available: STACK_GLOBAL, GLOBAL, INST and BUILD.
Example:
function_name = input("> ") # > system
func = STACK_GLOBAL('os', function_name) # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077
Behaviour:
- PUSH modname
- PUSH name
- STACK_GLOBAL
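The three steps above can be reproduced by hand with the opcode constants exported by the standard `pickle` module. This is a sketch of the equivalent bytecode, not Pickora's actual output; `short_str` is a hypothetical helper, and `builtins.len` stands in for `os.system` to keep the example harmless.

```python
import pickle


def short_str(text):
    """Encode text as a SHORT_BINUNICODE push (length must fit in one byte)."""
    data = text.encode("utf-8")
    return pickle.SHORT_BINUNICODE + bytes([len(data)]) + data


payload = (
    pickle.PROTO + b"\x04"      # protocol 4 header
    + short_str("builtins")     # PUSH modname
    + short_str("len")          # PUSH name
    + pickle.STACK_GLOBAL       # pop both strings, push builtins.len
    + pickle.STOP
)

func = pickle.loads(payload)
```

Because both strings are ordinary stack values, they can be computed at load time, which is what the `input("> ")` example relies on.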
Example:
func = GLOBAL("os", "system") # <built-in function system>
func("date") # Tue Jan 13 33:33:37 UTC 2077
Behaviour:
Simply write this piece of bytecode: f"c{modname}\n{name}\n"
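A minimal sketch of that bytecode loaded on its own, with a STOP opcode appended so the pickle is complete. `builtins.len` is used here instead of `os.system` purely to keep the example harmless.

```python
import pickle

modname, name = "builtins", "len"

# GLOBAL takes both names as newline-terminated text embedded in the opcode,
# so unlike STACK_GLOBAL they must be known at compile time.
payload = f"c{modname}\n{name}\n".encode() + pickle.STOP

func = pickle.loads(payload)
```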
Example:
command = input("cmd> ") # cmd> date
INST("os", "system", (command,)) # Tue Jan 13 33:33:37 UTC 2077
Behaviour:
- PUSH a MARK
- PUSH `args` in order
- Run this piece of bytecode: `f'i{modname}\n{name}\n'`
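As a sketch, the same sequence can be assembled by hand with the standard `pickle` opcode constants. This is hand-written bytecode rather than Pickora output, and `operator.add` with two integers stands in for `os.system` to keep it harmless.

```python
import pickle

modname, name = "operator", "add"

payload = (
    pickle.MARK                           # PUSH a MARK
    + pickle.INT + b"1\n"                 # PUSH args in order
    + pickle.INT + b"2\n"
    + f"i{modname}\n{name}\n".encode()    # INST: call operator.add(1, 2)
    + pickle.STOP
)

result = pickle.loads(payload)
```

INST collects everything above the MARK as the argument tuple and calls the named global with it.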
`state` is for `inst.__setstate__(state)` and `slotstate` is for setting attributes.
Example:
from collections import _collections_abc
BUILD(_collections_abc, None, {'__all__': ['ChainMap', 'Counter', 'OrderedDict']})
Behaviour:
- PUSH `inst`
- PUSH `(state, slotstate)` (tuple)
- PUSH `BUILD`
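To illustrate what the BUILD opcode does with the tuple, here is a hand-assembled sketch using the standard `pickle` opcode constants. It assumes an instance without `__setstate__` (a `types.SimpleNamespace`, chosen only for illustration), in which case the unpickler merges `state` into `inst.__dict__` and applies `slotstate` with `setattr`. This is not Pickora output.

```python
import pickle

# PUSH inst: look up types.SimpleNamespace and call it with no arguments.
push_inst = (
    pickle.GLOBAL + b"types\nSimpleNamespace\n"
    + pickle.EMPTY_TUPLE + pickle.REDUCE
)

# state = {'a': 1}, merged into inst.__dict__
push_state = (
    pickle.EMPTY_DICT + pickle.UNICODE + b"a\n"
    + pickle.INT + b"1\n" + pickle.SETITEM
)

# slotstate = {'b': 2}, applied with setattr(inst, 'b', 2)
push_slotstate = (
    pickle.EMPTY_DICT + pickle.UNICODE + b"b\n"
    + pickle.INT + b"2\n" + pickle.SETITEM
)

payload = (
    push_inst
    + pickle.MARK + push_state + push_slotstate + pickle.TUPLE  # PUSH (state, slotstate)
    + pickle.BUILD
    + pickle.STOP
)

inst = pickle.loads(payload)
```

If the instance did define `__setstate__`, the whole tuple would be passed to that method instead.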
RTFM.
It's cool.
No, not at all, it's definitely useless.
Yep, it's cool garbage.
No. All pickle can do is just simply define a variable or call a function, so this kind of syntax wouldn't exist.
But if you want to do things like:
ans = input("Yes/No: ")
if ans == 'Yes':
    print("Great!")
elif ans == 'No':
    exit()
It's still achievable! You can rewrite your code like this:
from functools import partial
condition = {'Yes': partial(print, 'Great!'), 'No': exit}
ans = input("Yes/No: ")
condition.get(ans, repr)()
Ta-da!
For loops, you can try `map`, `starmap`, `reduce`, etc.
And yes, you are right, it's functional programming time!
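As a rough illustration of that style, the sketch below replaces three common loop shapes with calls only. It is plain Python meant to show the pattern; whether a given snippet compiles depends on the syntax Pickora supports, and the variable names are illustrative.

```python
from functools import reduce
from itertools import starmap
from operator import add, mul

numbers = [1, 2, 3, 4]

# Instead of: for n in numbers: total += n
total = reduce(add, numbers, 0)

# Instead of: for n in numbers: doubled.append(n * 2)
doubled = list(map(mul, numbers, [2, 2, 2, 2]))

# Instead of: for a, b in pairs: products.append(a * b)
products = list(starmap(mul, [(1, 2), (3, 4)]))
```

Each line is just an assignment wrapped around function calls with positional arguments, which matches what pickle bytecode can express.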