Packaging Python Code with PyInstaller, and Verifying Why It Is Not Secure

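When code has to be handed to someone else, we usually don't want to give away the source. The common approach is to ship an executable or an easy-to-call .so library instead. Today we'll package a script with PyInstaller and see how well that works.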

System environment: Ubuntu 18.04

Packaging the code

Take a small script that multiplies two matrices with numpy as the example:

#!/bin/python3
# -*- coding=utf8 -*-

import numpy as np


class Calc:
    def __init__(self):
        self.a_array = [[1, 0], [0, 1]]
        self.b_array = [[4, 1], [2, 2]]

    def calc_matmul(self):
        return np.matmul(self.a_array, self.b_array)


def main():
    c = Calc()
    result = c.calc_matmul()
    print(result)


if __name__ == '__main__':
    main()

Now install PyInstaller. The meaning of its options is explained in the official documentation:

pip install pyinstaller

I chose onefile mode here:

$ pyinstaller --onefile calc.py 
25 INFO: PyInstaller: 3.6
25 INFO: Python: 3.7.5
25 INFO: Platform: Linux-5.3.0-40-generic-x86_64-with-Ubuntu-18.04-bionic
25 INFO: wrote /home/top/CodeWork/Python/ABC/calc.spec
28 INFO: UPX is not available.
30 INFO: Extending PYTHONPATH with paths
['/home/top/CodeWork/Python/ABC', '/home/top/CodeWork/Python/ABC']
30 INFO: checking Analysis
30 INFO: Building Analysis because Analysis-00.toc is non existent
30 INFO: Initializing module dependency graph...
31 INFO: Caching module graph hooks...
34 INFO: Analyzing base_library.zip ...
2423 INFO: Caching module dependency graph...
2479 INFO: running Analysis Analysis-00.toc
2541 INFO: Analyzing /home/top/CodeWork/Python/ABC/calc.py
3134 INFO: Processing pre-find module path hook distutils
3134 INFO: distutils: retargeting to non-venv dir '/usr/lib/python3.7'
4070 INFO: Processing pre-safe import module hook setuptools.extern.six.moves
4985 INFO: Processing pre-find module path hook site
4985 INFO: site: retargeting to fake-dir '/home/top/CodeWork/Python/ABC/venv/lib/python3.7/site-packages/PyInstaller/fake-modules'
6414 INFO: Processing module hooks...
6414 INFO: Loading module hook "hook-lib2to3.py"...
6415 INFO: Loading module hook "hook-sysconfig.py"...
6422 INFO: Loading module hook "hook-pkg_resources.py"...
7336 INFO: Processing pre-safe import module hook win32com
7480 INFO: Excluding import '__main__'
7481 INFO: Removing import of __main__ from module pkg_resources
7481 INFO: Loading module hook "hook-numpy.core.py"...
7582 INFO: Loading module hook "hook-encodings.py"...
7618 INFO: Loading module hook "hook-distutils.py"...
7619 INFO: Loading module hook "hook-xml.py"...
7658 INFO: Loading module hook "hook-numpy.py"...
7658 INFO: Loading module hook "hook-pydoc.py"...
7658 INFO: Loading module hook "hook-setuptools.py"...
8246 INFO: checking Tree
8246 INFO: Building Tree because Tree-00.toc is non existent
8246 INFO: Building Tree Tree-00.toc
8247 INFO: Looking for ctypes DLLs
8470 INFO: Analyzing run-time hooks ...
8477 INFO: Including run-time hook 'pyi_rth_pkgres.py'
8478 INFO: Including run-time hook 'pyi_rth_multiprocessing.py'
8486 INFO: Looking for dynamic libraries
8913 INFO: Looking for eggs
8913 INFO: Python library not in binary dependencies. Doing additional searching...
8988 INFO: Using Python library /usr/lib/x86_64-linux-gnu/libpython3.7m.so.1.0
8995 INFO: Warnings written to /home/top/CodeWork/Python/ABC/build/calc/warn-calc.txt
9031 INFO: Graph cross-reference written to /home/top/CodeWork/Python/ABC/build/calc/xref-calc.html
9044 INFO: checking PYZ
9044 INFO: Building PYZ because PYZ-00.toc is non existent
9044 INFO: Building PYZ (ZlibArchive) /home/top/CodeWork/Python/ABC/build/calc/PYZ-00.pyz
9597 INFO: Building PYZ (ZlibArchive) /home/top/CodeWork/Python/ABC/build/calc/PYZ-00.pyz completed successfully.
9603 INFO: checking PKG
9603 INFO: Building PKG because PKG-00.toc is non existent
9603 INFO: Building PKG (CArchive) PKG-00.pkg
23709 INFO: Building PKG (CArchive) PKG-00.pkg completed successfully.
23710 INFO: Bootloader /home/top/CodeWork/Python/ABC/venv/lib/python3.7/site-packages/PyInstaller/bootloader/Linux-64bit/run
23710 INFO: checking EXE
23710 INFO: Building EXE because EXE-00.toc is non existent
23710 INFO: Building EXE from EXE-00.toc
23711 INFO: Appending archive to ELF section in EXE /home/top/CodeWork/Python/ABC/dist/calc
24557 INFO: Building EXE from EXE-00.toc completed successfully.

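After a long stream of log output the build succeeds, and two new folders appear: build and dist. The executable we want is in dist. The log also notes that UPX is not available; if you want the executable to be as small as possible, you can set UPX up:
https://github.com/upx/upx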

To make sure it really is self-contained, copy the generated executable to another machine (a bare machine in my case) and inspect calc with file and ldd:

$ file calc
calc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 2.6.32, BuildID[sha1]=294d1f19a085a730da19a6c55788ec08c2187039, stripped

$ ldd calc
linux-vdso.so.1 (0x00007ffcac9bd000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd33aacd000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd33a8b0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd33a4bf000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd33acd1000)

Run it to verify:

$ chmod +x calc 
$ ./calc
[[4 1]
 [2 2]]

It does run successfully. When calc runs, it creates a folder named _MEIXXXXX under the tmp directory and deletes it when the program exits.
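As a side note, a packaged script can find out at run time whether it is running from that temporary folder. The sketch below relies on the `sys.frozen` and `sys._MEIPASS` attributes that PyInstaller's bootloader sets; the helper name `bundle_dir` is made up for illustration and is not part of the original calc.py.

```python
import os
import sys


def bundle_dir():
    """Return the directory that holds the program's bundled files.

    Inside a PyInstaller onefile build this is the temporary _MEIxxxxxx
    folder; in a normal interpreter it falls back to the script's folder.
    """
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        # Set by the PyInstaller bootloader when running from a bundle.
        return sys._MEIPASS
    # Not frozen: use the directory of the script being run.
    return os.path.abspath(os.path.dirname(sys.argv[0]) or ".")


print(bundle_dir())
```

Run from source this prints the script's directory; inside the onefile executable it should print the _MEI folder described above.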

Security weaknesses

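The result above looks great: a single executable, with no dependencies to install. So where is it insecure?
PyInstaller ships with a tool called pyi-archive_viewer, which can inspect the contents of any archive built by PyInstaller (a PYZ or PKG) or of any executable (an .exe file, or an ELF or COFF binary):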

$ pyi-archive_viewer dist/calc 
pos, length, uncompressed, iscompressed, type, name
[(0, 247, 316, 1, 'm', 'struct'),
(247, 1117, 1827, 1, 'm', 'pyimod01_os_path'),
(1364, 4367, 9332, 1, 'm', 'pyimod02_archive'),
(5731, 7410, 18646, 1, 'm', 'pyimod03_importers'),
(13141, 1853, 4157, 1, 's', 'pyiboot01_bootstrap'),
(14994, 209, 254, 1, 's', 'pyi_rth_pkgres'),
(15203, 1066, 1757, 1, 's', 'pyi_rth_multiprocessing'),
(16269, 436, 754, 1, 's', 'calc'),
(16705, 8275, 22040, 1, 'b', '_bz2.cpython-37m-x86_64-linux-gnu.so'),
(24980, 102470, 149808, 1, 'b', '_codecs_cn.cpython-37m-x86_64-linux-gnu.so'),
(127450, 35790, 158032, 1, 'b', '_codecs_hk.cpython-37m-x86_64-linux-gnu.so'),
(163240,
9809,
26928,
1,
'b',
'_codecs_iso2022.cpython-37m-x86_64-linux-gnu.so'),
(173049, 97278, 272688, 1, 'b', '_codecs_jp.cpython-37m-x86_64-linux-gnu.so'),
(270327, 78894, 137520, 1, 'b', '_codecs_kr.cpython-37m-x86_64-linux-gnu.so'),
(349221, 63576, 112944, 1, 'b', '_codecs_tw.cpython-37m-x86_64-linux-gnu.so'),
(412797, 1774, 6216, 1, 'b', '_contextvars.cpython-37m-x86_64-linux-gnu.so'),
(414571, 52465, 126912, 1, 'b', '_ctypes.cpython-37m-x86_64-linux-gnu.so'),
(467036, 28577, 79152, 1, 'b', '_curses.cpython-37m-x86_64-linux-gnu.so'),
(495613, 62723, 179104, 1, 'b', '_decimal.cpython-37m-x86_64-linux-gnu.so'),
(558336, 11073, 29912, 1, 'b', '_hashlib.cpython-37m-x86_64-linux-gnu.so'),
(569409, 32233, 66472, 1, 'b', '_json.cpython-37m-x86_64-linux-gnu.so'),
(601642, 13561, 33592, 1, 'b', '_lzma.cpython-37m-x86_64-linux-gnu.so'),
(615203,
24109,
56600,
1,
'b',
'_multibytecodec.cpython-37m-x86_64-linux-gnu.so'),
(639312,
6007,
15856,
1,
'b',
'_multiprocessing.cpython-37m-x86_64-linux-gnu.so'),
(645319, 2041, 6280, 1, 'b', '_opcode.cpython-37m-x86_64-linux-gnu.so'),
(647360, 5266, 16888, 1, 'b', '_queue.cpython-37m-x86_64-linux-gnu.so'),
(652626, 42227, 116568, 1, 'b', '_ssl.cpython-37m-x86_64-linux-gnu.so'),
(694853, 30120, 66728, 1, 'b', 'libbz2.so.1.0'),
(724973, 1296821, 2917216, 1, 'b', 'libcrypto.so.1.1'),
(2021794, 70921, 202880, 1, 'b', 'libexpat.so.1'),
(2092715, 15792, 31032, 1, 'b', 'libffi.so.6'),
(2108507, 289465, 1023960, 1, 'b', 'libgfortran-ed201abd.so.3.0.0'),
(2397972, 79560, 153984, 1, 'b', 'liblzma.so.5'),
(2477532, 96843, 227944, 1, 'b', 'libmpdec.so.2'),
(2574375, 89731, 190400, 1, 'b', 'libncursesw.so.5'),
(2664106, 8144678, 29724672, 1, 'b', 'libopenblasp-r0-34a18dc3.3.7.so'),
(10808784, 1924269, 5075632, 1, 'b', 'libpython3.7m.so.1.0'),
(12733053, 126584, 294632, 1, 'b', 'libreadline.so.7'),
(12859637, 230870, 577312, 1, 'b', 'libssl.so.1.1'),
(13090507, 64264, 170784, 1, 'b', 'libtinfo.so.5'),
(13154771, 60099, 116960, 1, 'b', 'libz.so.1'),
(13214870, 10217, 25392, 1, 'b', 'mmap.cpython-37m-x86_64-linux-gnu.so'),
(13225087,
183969,
580203,
1,
'b',
'numpy/core/_multiarray_tests.cpython-37m-x86_64-linux-gnu.so'),
(13409056,
5453208,
21507704,
1,
'b',
'numpy/core/_multiarray_umath.cpython-37m-x86_64-linux-gnu.so'),
(18862264,
117613,
386852,
1,
'b',
'numpy/fft/_pocketfft_internal.cpython-37m-x86_64-linux-gnu.so'),
(18979877,
247676,
880560,
1,
'b',
'numpy/linalg/_umath_linalg.cpython-37m-x86_64-linux-gnu.so'),
(19227553,
31084,
112928,
1,
'b',
'numpy/linalg/lapack_lite.cpython-37m-x86_64-linux-gnu.so'),
(19258637,
260469,
839751,
1,
'b',
'numpy/random/_bit_generator.cpython-37m-x86_64-linux-gnu.so'),
(19519106,
612717,
2055697,
1,
'b',
'numpy/random/_bounded_integers.cpython-37m-x86_64-linux-gnu.so'),
(20131823,
391914,
1336140,
1,
'b',
'numpy/random/_common.cpython-37m-x86_64-linux-gnu.so'),
(20523737,
915600,
3146458,
1,
'b',
'numpy/random/_generator.cpython-37m-x86_64-linux-gnu.so'),
(21439337,
143505,
441589,
1,
'b',
'numpy/random/_mt19937.cpython-37m-x86_64-linux-gnu.so'),
(21582842,
105117,
313851,
1,
'b',
'numpy/random/_pcg64.cpython-37m-x86_64-linux-gnu.so'),
(21687959,
123509,
378648,
1,
'b',
'numpy/random/_philox.cpython-37m-x86_64-linux-gnu.so'),
(21811468,
80675,
226814,
1,
'b',
'numpy/random/_sfc64.cpython-37m-x86_64-linux-gnu.so'),
(21892143,
683210,
2363744,
1,
'b',
'numpy/random/mtrand.cpython-37m-x86_64-linux-gnu.so'),
(22575353, 11621, 31752, 1, 'b', 'readline.cpython-37m-x86_64-linux-gnu.so'),
(22586974, 5004, 15656, 1, 'b', 'resource.cpython-37m-x86_64-linux-gnu.so'),
(22591978, 7944, 24968, 1, 'b', 'termios.cpython-37m-x86_64-linux-gnu.so'),
(22599922, 217880, 780923, 1, 'x', 'base_library.zip'),
(22817802, 615, 4086, 1, 'x', 'include/python3.7m/pyconfig.h'),
(22818417,
20139,
83035,
1,
'x',
'lib/python3.7/config-3.7m-x86_64-linux-gnu/Makefile'),
(22838556, 2445, 6562, 1, 'x', 'lib2to3/Grammar.txt'),
(22841001, 419, 793, 1, 'x', 'lib2to3/PatternGrammar.txt'),
(22841420, 3469029, 3469029, 0, 'z', 'PYZ-00.pyz')]
?

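The listing reveals the dependencies bundled from the environment, such as the numpy I installed, and two entries that matter a great deal: calc and PYZ-00.pyz. Both are closely tied to the source code: calc (type 's') is the compiled entry script, and PYZ-00.pyz is the archive of compiled pure-Python modules it imports.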
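To make the listing easier to read, here is a small sketch that picks the interesting entries out of it. The rows are copied from the output above; the reading of the type codes ('s' for script, 'z' for the PYZ archive) and the prefix filter for PyInstaller's own bootstrap scripts are my assumptions, not something the tool prints.

```python
# A few rows copied from the pyi-archive_viewer listing:
# (pos, length, uncompressed, iscompressed, type, name)
toc = [
    (0, 247, 316, 1, "m", "struct"),
    (13141, 1853, 4157, 1, "s", "pyiboot01_bootstrap"),
    (14994, 209, 254, 1, "s", "pyi_rth_pkgres"),
    (15203, 1066, 1757, 1, "s", "pyi_rth_multiprocessing"),
    (16269, 436, 754, 1, "s", "calc"),
    (13154771, 60099, 116960, 1, "b", "libz.so.1"),
    (22841420, 3469029, 3469029, 0, "z", "PYZ-00.pyz"),
]


def user_scripts(entries):
    """Names of script entries that are not PyInstaller's own helpers.

    Assumption: bootstrap and run-time hook scripts start with
    'pyiboot' or 'pyi_rth_', as they do in the listing above.
    """
    return [
        name
        for *_, kind, name in entries
        if kind == "s" and not name.startswith(("pyiboot", "pyi_rth_"))
    ]


def pyz_archives(entries):
    """Names of embedded PYZ archives (type 'z')."""
    return [name for *_, kind, name in entries if kind == "z"]


print(user_scripts(toc))
print(pyz_archives(toc))
```

With the rows above, the only user script left is calc and the only archive is PYZ-00.pyz, which are exactly the two entries worth extracting.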

Verifying the weakness

Continuing at the pyi-archive_viewer prompt, extract the calc entry from the executable as a pyc file:

? x calc
to filename? calc.pyc

Now we have a calc.pyc file. Once the pyc is in hand, if the source was not obfuscated or otherwise processed, the original source can be recovered almost verbatim.

Next, install the well-known decompiler uncompyle6:

pip install uncompyle6

Run it directly:

$ uncompyle6 calc.pyc 
Traceback (most recent call last):
File "/home/top/CodeWork/Python/ABC/venv/lib/python3.7/site-packages/xdis/load.py", line 143, in load_module_from_file_object
float_version = float(magics.versions[magic][:3])
KeyError: b'\xe3\x00\x00\x00'

During handling of the above exception, another exception occurred:

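It fails. uncompyle6 took the first four bytes of the file, b'\xe3\x00\x00\x00', for a magic number, but they are actually the start of the compiled code: the extracted pyc is missing its header, magic number included. So where do we get the magic number?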

When Python imports a module (or when you run python3 -m py_compile calc.py), it writes a __pycache__ directory containing calc.cpython-37.pyc. Dump both files with hexdump:

$ hexdump -C __pycache__/calc.cpython-37.pyc 
00000000 42 0d 0d 0a 00 00 00 00 d3 b9 63 5e 74 01 00 00 |B.........c^t...|
00000010 e3 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 |................|
00000020 00 40 00 00 00 73 30 00 00 00 64 00 64 01 6c 00 |.@...s0...d.d.l.|
00000030 5a 01 47 00 64 02 64 03 84 00 64 03 83 02 5a 02 |Z.G.d.d...d...Z.|
00000040 64 04 64 05 84 00 5a 03 65 04 64 06 6b 02 72 2c |d.d...Z.e.d.k.r,|
00000050 65 03 83 00 01 00 64 01 53 00 29 07 e9 00 00 00 |e.....d.S.).....|

$ hexdump -C calc.pyc
00000000 e3 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 |................|
00000010 00 40 00 00 00 73 30 00 00 00 64 00 64 01 6c 00 |.@...s0...d.d.l.|
00000020 5a 01 47 00 64 02 64 03 84 00 64 03 83 02 5a 02 |Z.G.d.d...d...Z.|
00000030 64 04 64 05 84 00 5a 03 65 04 64 06 6b 02 72 2c |d.d...Z.e.d.k.r,|
00000040 65 03 83 00 01 00 64 01 53 00 29 07 e9 00 00 00 |e.....d.S.).....|
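The first 16 bytes of the __pycache__ dump are worth decoding. The sketch below assumes the Python 3.7+ pyc header layout (4-byte magic, 4-byte flags, then for timestamp-based files a 4-byte mtime and a 4-byte source size); the bytes are copied from the first hexdump line, and `parse_pyc_header` is a name I made up.

```python
import struct

# First line of `hexdump -C __pycache__/calc.cpython-37.pyc`
header = bytes.fromhex("420d0d0a00000000d3b9635e74010000")


def parse_pyc_header(data):
    """Split a 16-byte pyc header (Python 3.7+ layout) into its fields."""
    if len(data) < 16:
        raise ValueError("a Python 3.7+ pyc header is 16 bytes long")
    flags, mtime, source_size = struct.unpack("<III", data[4:16])
    return {
        # The magic is a 2-byte little-endian number followed by b"\r\n".
        "magic_number": int.from_bytes(data[:2], "little"),
        "magic_suffix_ok": data[2:4] == b"\r\n",
        "flags": flags,              # 0 means a timestamp-based pyc
        "mtime": mtime,              # source modification time (Unix seconds)
        "source_size": source_size,  # source size modulo 2**32
    }


info = parse_pyc_header(header)
print(info)
```

The magic number comes out as 3394 and the source size as 372, the same values uncompyle6 reports once the file can be decompiled.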

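The two files are essentially identical apart from an offset: compared with calc.cpython-37.pyc, calc.pyc is missing the first line of the dump. Those 16 bytes are the pyc header, which begins with the 4-byte version magic number (42 0d 0d 0a for Python 3.7). Edit calc.pyc and prepend the 16-byte header from calc.cpython-37.pyc. Note that the magic number differs between Python versions, so take it from the same version that built the executable.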
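Instead of a hex editor, the header can be prepended with a few lines of Python. This is a minimal sketch assuming the Python 3.7+ header layout; `add_pyc_header` is a hypothetical helper, and the demo marshals a tiny code object as a stand-in for the headerless data extracted from the executable.

```python
import importlib.util
import marshal
import struct


def add_pyc_header(code_bytes, magic=importlib.util.MAGIC_NUMBER,
                   mtime=0, source_size=0):
    """Prepend a 16-byte timestamp-style pyc header to raw marshalled code.

    The default magic is that of the interpreter running this script.
    For the article's file you would pass the Python 3.7 magic,
    bytes.fromhex("420d0d0a"), because it must match the build version.
    """
    header = magic + struct.pack("<III", 0, mtime, source_size)
    return header + code_bytes


# Stand-in for the extracted, headerless calc.pyc contents.
code = compile("result = 6 * 7", "demo.py", "exec")
raw = marshal.dumps(code)

patched = add_pyc_header(raw)

# Round trip: skip the 16-byte header, load the code object, run it.
namespace = {}
exec(marshal.loads(patched[16:]), namespace)
print(namespace["result"])
```

For the real file, read calc.pyc in binary mode, pass its contents together with the 3.7 magic, and write the result back before running uncompyle6. The mtime and size fields are left at zero here on the assumption that a decompiler only needs the magic to pick the bytecode version.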

Run uncompyle6 again and the source comes out:

$ uncompyle6 calc.pyc 
# uncompyle6 version 3.6.4
# Python bytecode 3.7 (3394)
# Decompiled from: Python 3.7.5 (default, Nov 7 2019, 10:50:52)
# [GCC 8.3.0]
# Embedded file name: calc.py
# Size of source mod 2**32: 372 bytes
import numpy as np

class Calc:

    def __init__(self):
        self.a_array = [
         [
          1, 0], [0, 1]]
        self.b_array = [[4, 1], [2, 2]]

    def calc_matmul(self):
        return np.matmul(self.a_array, self.b_array)


def main():
    c = Calc()
    result = c.calc_matmul()
    print(result)


if __name__ == '__main__':
    main()
# okay decompiling calc.pyc

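So PyInstaller is not a safe way to hide source code; it only raises the bar for decompilation slightly. How can the bar be raised further? I think combining it with Cython would help.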

References

1. https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.matmul.html
2. https://pyinstaller.readthedocs.io/en/stable/usage.html
