|
|
Created:
Aug. 16, 2016, 2:59 p.m. by Wladimir Palant Modified:
Aug. 17, 2016, 7:37 p.m. Reviewers:
Sebastian Noack Visibility:
Public. |
DescriptionIssue 4339 - Replace M2Crypto by PyCrypto
Repository: hg.adblockplus.org/buildtools
Patch Set 1 #
Total comments: 14
Patch Set 2 : Addressed comments #
Total comments: 2
Patch Set 3 : Merged cert and private key extraction #
Total comments: 4
Patch Set 4 : Addressed more nits #
MessagesTotal messages: 12
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:208: '--data-to-sign', digest_filename, For reference: despite the misleading name, --data-to-sign won't actually write the raw data to the file but a SHA1 digest of it. So it's essentially the same thing as --digestinfo-to-sign, only that --digestinfo-to-sign will also ASN.1-encode the digest. Getting ASN.1 encoding up front makes sense when dealing with essentially broken RSA implementations like `openssl rsautl -sign` but this seems to be the only signing library which will sign without doing ASN.1 encoding (unlike the recommended `openssl pkeyutl -sign` for example). So PyCrypto doesn't support the scenario where it would get an already encoded digest, at least it could be convinced to accept an already pre-calculated SHA1 digest. One would think that `xar --dump-toc-raw` would produce the data to be signed so that tricks with pre-calculated hashes wouldn't be necessary. This seems to be close to what's being hashed (in particular, xar definitely hashes compressed data) but not quite it. In particular, the output produced here contains the toc hash itself - so it seems to go too far.
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:208: '--data-to-sign', digest_filename, On 2016/08/16 15:12:08, Wladimir Palant wrote: > One would think that `xar --dump-toc-raw` would produce the data to be signed so > that tricks with pre-calculated hashes wouldn't be necessary. This seems to be > close to what's being hashed (in particular, xar definitely hashes compressed > data) but not quite it. In particular, the output produced here contains the toc > hash itself - so it seems to go too far. Actually, I finally realized what's going on there. `xar --dump-header` produces the following output: size: 28 Compressed TOC length: 15572 That's the correct offset and length of the TOC, retrieving this block from the file gives me the right hash. However, `xar --dump-toc-raw` produces a slice of the file with the right length but starting at offset 64. Consequently, the data is useless. This functionality isn't part of the original xar tool but has been rather added by Kyle J. McKay - and it's simply buggy. I filed https://github.com/mackyle/xar/issues/11 on this.
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:120: keydata = re.sub(r'(-+END PRIVATE KEY-+).*', r'\1', data, flags=re.S) Is this even necessary? I would expect RSA.importKey() to simply ignore everything that is not a key in the file. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:131: def get_sequence(data): Any reason this is a nested function, and not just defined on the top-level? https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:138: tbsCertificate = get_sequence(cert)[0] Nit: camel case
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:120: keydata = re.sub(r'(-+END PRIVATE KEY-+).*', r'\1', data, flags=re.S) On 2016/08/16 16:32:05, Sebastian Noack wrote: > Is this even necessary? I would expect RSA.importKey() to simply ignore > everything that is not a key in the file. Yes, I would expect that as well. However, I get "RSA key format not supported" if I try it with the extra data. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:131: def get_sequence(data): On 2016/08/16 16:32:05, Sebastian Noack wrote: > Any reason this is a nested function, and not just defined on the top-level? It isn't exactly a general-purpose function - merely a tiny helper for this particular function. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:138: tbsCertificate = get_sequence(cert)[0] On 2016/08/16 16:32:05, Sebastian Noack wrote: > Nit: camel case I know. But that's how this field is named in the spec ;)
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:120: keydata = re.sub(r'(-+END PRIVATE KEY-+).*', r'\1', data, flags=re.S) I just noticed that this code assumes the key to be at the beginning. This wasn't case before. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:131: def get_sequence(data): On 2016/08/16 17:06:04, Wladimir Palant wrote: > On 2016/08/16 16:32:05, Sebastian Noack wrote: > > Any reason this is a nested function, and not just defined on the top-level? > > It isn't exactly a general-purpose function - merely a tiny helper for this > particular function. It doesn't need to be general-purpose to go on the top-level. Feel free to prefix the function to mark it private. I generally think that nested functions, make the code more complicated. And as per the Zen of Python, "flat is better than nested". https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:138: tbsCertificate = get_sequence(cert)[0] On 2016/08/16 17:06:04, Wladimir Palant wrote: > On 2016/08/16 16:32:05, Sebastian Noack wrote: > > Nit: camel case > > I know. But that's how this field is named in the spec ;) That doesn't seem a reason to me, to violate PEP-8. The code is as understandable if you just adapt it to underscore notation.
https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:120: keydata = re.sub(r'(-+END PRIVATE KEY-+).*', r'\1', data, flags=re.S) On 2016/08/17 08:43:07, Sebastian Noack wrote: > I just noticed that this code assumes the key to be at the beginning. This > wasn't case before. True, PyCrypto seems to be angry about certificates before the private key as well. Fixed that. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:131: def get_sequence(data): On 2016/08/17 08:43:07, Sebastian Noack wrote: > On 2016/08/16 17:06:04, Wladimir Palant wrote: > > On 2016/08/16 16:32:05, Sebastian Noack wrote: > > > Any reason this is a nested function, and not just defined on the top-level? > > > > It isn't exactly a general-purpose function - merely a tiny helper for this > > particular function. > > It doesn't need to be general-purpose to go on the top-level. Feel free to > prefix the function to mark it private. I generally think that nested functions, > make the code more complicated. And as per the Zen of Python, "flat is better > than nested". Done. https://codereview.adblockplus.org/29349869/diff/29349870/packagerSafari.py#n... packagerSafari.py:138: tbsCertificate = get_sequence(cert)[0] On 2016/08/17 08:43:07, Sebastian Noack wrote: > On 2016/08/16 17:06:04, Wladimir Palant wrote: > > On 2016/08/16 16:32:05, Sebastian Noack wrote: > > > Nit: camel case > > > > I know. But that's how this field is named in the spec ;) > > That doesn't seem a reason to me, to violate PEP-8. The code is as > understandable if you just adapt it to underscore notation. Done.
https://codereview.adblockplus.org/29349869/diff/29349896/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349896/packagerSafari.py#n... packagerSafari.py:127: for match in re.finditer(CERTIFICATE_REGEXP, data, re.S): I think it might make sense to unify the logic here: for match in re.finditer(r'-+BEGIN (.*?)-+(.*?)-+END \1-+', data, re.S): what, content = match.groups() if what == 'CERTIFICATE': certs.append(base64.b64decode(content)) if what == 'PRIVATE KEY': key = RSA.importKey(content)
https://codereview.adblockplus.org/29349869/diff/29349896/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349896/packagerSafari.py#n... packagerSafari.py:127: for match in re.finditer(CERTIFICATE_REGEXP, data, re.S): On 2016/08/17 12:19:24, Sebastian Noack wrote: > I think it might make sense to unify the logic here: > > for match in re.finditer(r'-+BEGIN (.*?)-+(.*?)-+END \1-+', data, re.S): > what, content = match.groups() > if what == 'CERTIFICATE': > certs.append(base64.b64decode(content)) > if what == 'PRIVATE KEY': > key = RSA.importKey(content) Done.
https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py#n... packagerSafari.py:127: if key is None: Nit: Use |not x| instead |x is None| if it doesn't make a difference (it doesn't here), so that it reads more like human language, hence is easier to parse by humans. https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py#n... packagerSafari.py:128: raise Exception('Cound not find private key in file') Typo: Cound -> Couldn't
https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py File packagerSafari.py (right): https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py#n... packagerSafari.py:127: if key is None: On 2016/08/17 18:34:11, Sebastian Noack wrote: > Nit: Use |not x| instead |x is None| if it doesn't make a difference (it doesn't > here), so that it reads more like human language, hence is easier to parse by > humans. Done. https://codereview.adblockplus.org/29349869/diff/29349942/packagerSafari.py#n... packagerSafari.py:128: raise Exception('Cound not find private key in file') On 2016/08/17 18:34:11, Sebastian Noack wrote: > Typo: Cound -> Couldn't Done.
LGTM |