Issue 29317015: Issue 2625 - [cms] Crowdin synchronisation script

kzar

Patch Set 1 https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py File cms/converters.py (right): https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#newcode128 cms/converters.py:128: def localize_string(self, name, default, comment, localedata, ...

June 15, 2015, 2:24 p.m. (2015-06-15 14:24:28 UTC) #1

Wladimir Palant

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py#newcode19 cms/bin/translate.py:19: from itertools import islice Nit: I'm not a big ...

June 29, 2015, 7:05 p.m. (2015-06-29 19:05:38 UTC) #2

kzar

Patch Set 2 : Addressed Wladimir's feedback https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py#newcode19 cms/bin/translate.py:19: from itertools ...

July 2, 2015, 12:33 p.m. (2015-07-02 12:33:14 UTC) #3

Patch Set 2 : Addressed Wladimir's feedback

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:19: from itertools import islice
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: I'm not a big fan of importing individual functions, other than in cases
> where function name and module name are identical - when looking at a function
> call, it becomes less obvious that the function is imported from somewhere
else
> as opposed to being defined in this file. Writing io.BytesIO() isn't much
longer
> than BytesIO() but makes the code more obvious.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:36: defaultlocale = None
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> The three variables above shouldn't be globals - they refer to a specific
> invocation of crowdin_sync and should be passed in a parameters wherever
> appropriate.
> 
> Note that you can encapsulate API key and project name in a CrowdinAPI class -
> then you only need to pass in an instance of that class, and the functions can
> call api.GET("info") for example without any extra parameters.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:45: break
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Please link to
>
https://stackoverflow.com/questions/3992735/python-generator-that-groups-anot...
> which is where you've got that code it seems.
> 
> IMPORTANT: *Always* indicate code that you didn't write yourself.

I originally did take this function from Stackoverflow and I added a comment to
indicate that
https://github.com/kzar/cms/commit/4196784#diff-d9a5accf75fa7c66e418d6e9b1c78...
. Since then I was unhappy with the implementation and the others I found on
Stackoverflow and I wrote this one myself, mixing together parts from various
examples.

I was unsure weather to leave the comment in or not before posting the review as
I was no longer left with code that exactly matched any of the examples. I
decided to leave it out and I should have mentioned that in the review, sorry.
I'm happy to either add or leave the comment out, I agree it's very important to
give proper attribution.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:51: response = requests.request(request_method, url,
**kwargs)
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> You seem to be assuming that this will not throw an exception. However,
> HTTPError is about HTTP response codes like 404, but there is also
> ConnectionError and the like - e.g. if the host name could not be resolved.
I'm
> pretty sure that those will be raised when requests.request() is called.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:57: return response.json()
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> This call might also raise an exception - if JSON decoding fails. You need to
> intercept it as well if you want to log the error consistently.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:73: def ensure_required_locales(required_locales,
enabled_locales, defaultlocale):
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: configure_locales maybe? This is more about configuring the project than
> anything else.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:78: skipped_locales =
list(required_locales.difference(supported_locales))
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: why convert a set to list here? Just change skipped_locales.append() into
> skipped_locales.add() below and it should work.
> 
> Nit: skipped_locales = required_locales - supported_locales?

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:84: required_locales =
required_locales.intersection(supported_locales)
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: required_locales = required_locales & supported_locales? Or maybe even
more
> obvious: required_locales -= skipped_locales.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:93: data={"languages[]":
list(enabled_locales.union(required_locales))}
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: list(enabled_locales | required_locales) please.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:110: map(parse_file_node, project_info["files"])
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> You are misusing map() here, its callback should not have side-effects - the
> result returned should rather be used. I think that here the best option is to
> keep it simple:
> 
> for node in project_info["files"]:
>   parse_file_node(node)

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:125: path += "/"
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> I think the following should be simpler:
> 
>   local_files.add(page + ".json")
>   while "/" in page:
>     page = page.rsplit("/", 1)[0]
>     local_directories.add(page)

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:137: page = file_name[:-5]
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Use os.path.splitext()?

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:139: del page_strings[page]
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Why delete the strings?

We no longer need them, and as they might be quite large I thought it was worth
deleting them to avoid wasting memory.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:163: "files for locale %s..." % (len(files), locale))
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Uploading per locale might result in doing one request for each locale. You
> could compile the complete list instead:
> 
>   def open_files():
>     for locale in required_locales:
>       for file in files:
>         path = os.path.join(source_dir, "locales", locale, file)
>         if os.path.isfile(path):
>           yield ("files[%s]" % file, open(path, "r"))

This is deliberate, when uploading translations you are allowed to upload
multiple files at once but they must all be for the same language.
https://crowdin.com/page/api/upload-translation

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:167: f[1].close()
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> These files should be closed regardless of whether the request fails. Meaning:
> 
>   try:
>     logger.info(...)
>     crowdin_request(...)
>   finally:
>     for fieldname, file in files:
>       file.close()
> 
> Note that I'd really prefer deconstructing the tuples properly in the loop
> instead of using f[1].

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:205: archive.extractall(locale_path)
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Running extractall is pretty dangerous IMHO. I'd suggest iterating over all
> files, checking that they are valid (not default locale, ending with .json),
> then extracting them individually.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:216: project_info = crowdin_request("GET", "info")
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Nit: The two lines above and setting enabled_locales can be done outside the
> with block. In fact, this seems to belong logically into the
> ensure_required_locales function - it should just check which locales are
> already configured.

I originally didn't request the project information until after extracting the
page strings but during testing I found it to be a real pain. What would often
happen was the API key, project name or something else was wrong and instead of
getting immediate feedback you had to wait several minutes to find out. This way
you know immediately if there's a problem.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:222: required_locales, skipped_locales =
ensure_required_locales(
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Why do we need to know the skipped locales here? The warning has been shown
> already, and ignoring them in the download seems to make no sense - the
locales
> are skipped because they aren't supported by Crowdin, meaning that they cannot
> be part of the download.

We need to avoid deleting locale files for skipped_locales, otherwise
pre-existing translations for an unsupported locale will be lost. (Perhaps the
locale directory just needs to be renamed so the locale matches what Crowdin
expects?)

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:230: logger.error("No page strings found. (Wrong project
directory?) Aborting!")
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Nit: The settings.ini file is there, so it cannot be a wrong directory. "No
> existing strings found, maybe project directory not set up correctly?"

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:234: new_directories =
list(local_directories.difference(remote_directories))
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> Nit: I don't see the point converting sets to lists here. Also, can just use
the
> minus operator.

(Cool I found this one myself and later noticed your comment :) )
Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:239: existing_files =
list(local_files.intersection(remote_files))
On 2015/06/29 19:05:36, Wladimir Palant wrote:
> Nit: existing_files = local_files - new_files?

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py
File cms/converters.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:151: if locale == self._params["defaultlocale"]:
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> This case is already checked above (getting translation), just move your code
up
> there.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:157: "[" + ", ".join(["%d: %s" % (i, s) for i, s in
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> ", ".join("{%d}: %s" (i, s) for i, s in ...)
> 
> In other words: You don't need to create a temporary list here, a generator
will
> do. Also, we can specifically mention {1}, no need to expect the translators
to
> make the connection from numbers to placeholders.

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:159: )
On 2015/06/29 19:05:37, Wladimir Palant wrote:
> And what about comments for strings without fixed parts?

Whoops, good point.
Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:199: # Note: The comment, it is only relevant when uploading
translations
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> This is the wrong place for this comment, it is describing function
> localize_string().

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:365: # Note: The comment, it is only relevant when uploading
translations
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> Same here, this comment should be moved into function localize_string().

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/converters.py#n...
cms/converters.py:370: default =
self._params["source"].read_locale(self._params["locale"], page)[name]
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> On 2015/06/15 14:24:28, kzar wrote:
> > This change is required so that strings included with the get_strings global
> > function are stored with the parent page. That way the translators have
proper
> > context for these strings and locale files for the "master" language do not
> have
> > to be listed and read manually when synching with Crowdin.
> 
> This change also happens to be wrong. Each page in the
http://web.adblockplus.org
> repository calls get_string("installation", "menu") - our translations have it
> defined in the file menu.json rather than duplicated everywhere.
> 
> So, as you noticed here, merely filling up params["localedata"] doesn't work -
> sometimes pages use strings from multiple locale files. What we probably need
is
> some kind of callback:
> 
>   if "record_default" in self._params:
>     self._params["record_default"](name, default, comment, fixed_strings)
> 
> This callback can be specified as an optional parameter to
> cms.utils.get_page_params().

Done.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/sources.py#newc...
cms/sources.py:173: result = OrderedDict()
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> On 2015/06/15 14:24:28, kzar wrote:
> > This is required to keep the correct order of page strings uploaded to
> Crowdin.
> > Otherwise I found a lot of context was lost when viewing a page's strings in
> > Crowdin.
> 
> Crowdin isn't very good at keeping that order but it won't hurt...

I found that it does help in practice for our website.

https://codereview.adblockplus.org/29317015/diff/29317016/cms/utils.py
File cms/utils.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/utils.py#newcode33
cms/utils.py:33: "localecomments": {},
On 2015/06/29 19:05:38, Wladimir Palant wrote:
> This should no longer be necessary if locale data is collected via callback.

Done.

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests I see, we use "requests", because urllib ...

July 8, 2015, 1:03 p.m. (2015-07-08 13:03:21 UTC) #4

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests I just realized that simply using built-in ...

July 8, 2015, 2:23 p.m. (2015-07-08 14:23:00 UTC) #5

kzar

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests On 2015/07/08 14:23:00, Sebastian Noack wrote: > ...

July 8, 2015, 3:26 p.m. (2015-07-08 15:26:41 UTC) #6

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests On 2015/07/08 15:26:41, kzar wrote: > On ...

July 8, 2015, 3:43 p.m. (2015-07-08 15:43:50 UTC) #7

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:26: import requests
On 2015/07/08 15:26:41, kzar wrote:
> On 2015/07/08 14:23:00, Sebastian Noack wrote:
> > I just realized that simply using built-in urllib2 with a handwritten
request
> > body, doesn't seem too complicated either:
> > 
> > boundary = '--' + mimetools.choose_boundary()
> > data = [boundary, 'Content-Disposition: form-data; name="language"', '',
> locale]
> > for i, filename in enumerate(filenames):
> >     data.append(boundary)
> >     data.append('Content-Disposition: file; name="files[%d]"; filename="%d"'
%
> > (i, i))
> >     data.append('')
> >     with open(filename, 'rb') as file:
> >         data.append(file.read())
> > data.append(boundary + '--')
> > data.append('')
> > body = '\r\n'.join(data)
> > 
> > request = urllib2.Request(url)
> > request.add_header('Content-Type', 'multipart/form-data; boundary=' +
> boundary)
> > request.add_header('Content-Length', len(body))
> > request.set_data(body)
> > response = urllib2.urlopen(request)
> 
> I discussed using the requests library with Wladimir in #adblockplus before I
> started writing this code. I don't think it's helpful to continue that
> discussion now, after weeks have passed and the code has been written, refined
> and tested.

I'm certainly curious about the reason, you decided for to use "requests". This
library seems to be an unneeded wrapper around "urllib3", which seems to be
unneeded for the use case here, in the first place.

Usually, we (and in particular Wladimir) are opposed to introduce third-party
modules. And if we introduce a new dependency there need to be a good reason to
justify that. I don't care that you discussed that with Wladimir already. Since
I wasn't involved (or invited) to the discussion back then, I will revisit it.

Wladimir Palant

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py#newcode222 cms/bin/translate.py:222: required_locales, skipped_locales = ensure_required_locales( On 2015/07/02 12:33:11, kzar wrote: ...

July 8, 2015, 11:11 p.m. (2015-07-08 23:11:09 UTC) #8

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:222: required_locales, skipped_locales =
ensure_required_locales(
On 2015/07/02 12:33:11, kzar wrote:
> We need to avoid deleting locale files for skipped_locales,

You can (and should) limit deleting to required_locales.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:26: import requests
Doing multipart encoding manually is very awkward. While we do that in
buildtools to avoid introducing a dependency, here we have something that will
be run by the server rather than contributors - so a dependency is better than
adding code complexity.

We use requests because that's what Stack Overflow recommends :). I wasn't aware
that it is a wrapper around urllib3 and indeed - using urllib3 would make more
sense IMHO. While the requests API might be more convenient, just using
urllib3.filepost.encode_multipart_formdata() along with built-in APIs that any
Python developer knows will make this code more maintainable.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:39: self.defaultlocale = defaultlocale
It doesn't look like this field is ever used - and it seems misplaced here,
irrelevant for API requests.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:51: except requests.exceptions.ConnectionError:
There are more exception classes - ConnectionError was merely an example but
there is also Timeout for example. All of them seem to inherit from RequestError
however so catching this one will work.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:76: store = page_strings.setdefault(page, {})
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> Do we care to not unnecessarily change the order of strings, e.g. to keep
diffs
> compact? Then we should use an OrderedDict.

Not necessarily relevant for diffs but Crowdin will use the order we define
here. So - yes, OrderedDict sounds like a good idea.

However, the current code will create a dictionary instance on each call, most
of the time it will be thrown away. So maybe a slightly less elegant approach is
better:

  try:
    store = page_strings[page]
  except KeyError:
    store = page_strings[page] = OrderedDict()

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:150: for group in grouper(files,
CrowdinAPI.FILES_PER_REQUEST):
No need to assume that crowdin_api is a CrowdinAPI instance. Also no need to
assume that this constant is identical for all instances. Just use
crowdin_api.FILES_PER_REQUEST.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:177: CrowdinAPI.FILES_PER_REQUEST):
As above, crowdin_api.FILES_PER_REQUEST please.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:229: archive.extract(member, locale_path)
Please use posixpath module here rather than os.path - archive paths don't
follow OS path conventions. This should also eliminate the need for calling
normpath.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py
File cms/converters.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py#n...
cms/converters.py:144: if self._params["record_default_strings"]:
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> Is there any particular reason we call that function only for the default
> locale? This shouldn't be necessary as we only generate the default locale
when
> we extract translations, right?

That's assumptions, the kind that might not be true tomorrow.

> So maybe it would be
> better and simpler to patch Converter.localize_string in cms.bin.translate
> instead passing that callback around?

Patch a method to memoize return value without any assumptions about parameters
is one thing, wrapping such a complex function something entirely different -
that would move many assumptions about its functionality into a completely
unrelated piece of code. If anything, we could set the callback as an attribute
on Converter *instances* from cms.utils.get_page_params - "inheriting" this
callback would introduce significant complexity however and doesn't seem to be
worth it.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/utils.py
File cms/utils.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/utils.py#newcode21
cms/utils.py:21: record_default_strings=None):
Nit: This should be record_default_string as the callback only records one
string at a time.

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests On 2015/07/08 23:11:05, Wladimir Palant wrote: > ...

July 9, 2015, 9:26 p.m. (2015-07-09 21:26:55 UTC) #9

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:26: import requests
On 2015/07/08 23:11:05, Wladimir Palant wrote:
> Doing multipart encoding manually is very awkward. While we do that in
> buildtools to avoid introducing a dependency, here we have something that will
> be run by the server rather than contributors - so a dependency is better than
> adding code complexity.

Personally, I'd still prefer to go with built-in urllib2 and the implementation
I suggested above. The implementation in buildtools is rather hacky, I know.
However, I think the snippet I posted above would be reasonable. Also I don't
think that it would be justified to use a third-party library just to avoid
these few lines. However, it's not important enough to argue about.

> We use requests because that's what Stack Overflow recommends :). I wasn't
aware
> that it is a wrapper around urllib3 and indeed - using urllib3 would make more
> sense IMHO. While the requests API might be more convenient, just using
> urllib3.filepost.encode_multipart_formdata() along with built-in APIs that any
> Python developer knows will make this code more maintainable.

Yeah, if you both disagree to go the way I suggested above, I agree that using
urllib3 or maybe even better, as you suggested, just urllib/urllib2 in
combination with urllib3.filepost.encode_multipart_formdata() would certainly be
better than using "requests".

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:88: format=format, record_default_strings=record_string)
Recording the default strings is what we do here. But the converter should be
agnostic of it. For it it's just a callback to call on every string. And that is
what the argument name should reflect.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py
File cms/converters.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py#n...
cms/converters.py:144: if self._params["record_default_strings"]:
On 2015/07/08 23:11:08, Wladimir Palant wrote:
> On 2015/07/08 13:03:20, Sebastian Noack wrote:
> > Is there any particular reason we call that function only for the default
> > locale? This shouldn't be necessary as we only generate the default locale
> when
> > we extract translations, right?
> 
> That's assumptions, the kind that might not be true tomorrow.

Or the other way around. As far as I see it this code shouldn't be in the
default locale code path and the name shouldn't imply it. This will make the
logic of that "param" less complicated to understand.

> > So maybe it would be
> > better and simpler to patch Converter.localize_string in cms.bin.translate
> > instead passing that callback around?
> 
> Patch a method to memoize return value without any assumptions about
parameters
> is one thing, wrapping such a complex function something entirely different -
> that would move many assumptions about its functionality into a completely
> unrelated piece of code. If anything, we could set the callback as an
attribute
> on Converter *instances* from cms.utils.get_page_params - "inheriting" this
> callback would introduce significant complexity however and doesn't seem to be
> worth it.

Fair enough.

Wladimir Palant

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py#newcode26 cms/bin/translate.py:26: import requests On 2015/07/09 21:26:55, Sebastian Noack wrote: > ...

July 10, 2015, 9:24 p.m. (2015-07-10 21:24:05 UTC) #10

kzar

Patch Set 3 : Addressed further feedback https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py#newcode222 cms/bin/translate.py:222: required_locales, skipped_locales ...

July 11, 2015, 7:21 p.m. (2015-07-11 19:21:18 UTC) #11

Patch Set 3 : Addressed further feedback

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29317016/cms/bin/translate.p...
cms/bin/translate.py:222: required_locales, skipped_locales =
ensure_required_locales(
On 2015/07/08 23:11:05, Wladimir Palant wrote:
> On 2015/07/02 12:33:11, kzar wrote:
> > We need to avoid deleting locale files for skipped_locales,
> 
> You can (and should) limit deleting to required_locales.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:26: import requests
On 2015/07/10 21:24:04, Wladimir Palant wrote:
> On 2015/07/09 21:26:55, Sebastian Noack wrote:
> > Personally, I'd still prefer to go with built-in urllib2 and the
> implementation
> > I suggested above. The implementation in buildtools is rather hacky, I know.
> > However, I think the snippet I posted above would be reasonable.
> 
> Nope, they are both hacky and I'd definitely prefer some code for this that we
> don't own.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:39: self.defaultlocale = defaultlocale
On 2015/07/08 23:11:06, Wladimir Palant wrote:
> It doesn't look like this field is ever used - and it seems misplaced here,
> irrelevant for API requests.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:51: except requests.exceptions.ConnectionError:
On 2015/07/08 23:11:06, Wladimir Palant wrote:
> There are more exception classes - ConnectionError was merely an example but
> there is also Timeout for example. All of them seem to inherit from
RequestError
> however so catching this one will work.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:66: if chunk:
On 2015/07/08 13:03:19, Sebastian Noack wrote:
> Nit: You could get rid of the else block:
> 
> if not chunk:
>   break
> yield chunk

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:76: store = page_strings.setdefault(page, {})
On 2015/07/08 23:11:06, Wladimir Palant wrote:
> On 2015/07/08 13:03:20, Sebastian Noack wrote:
> > Do we care to not unnecessarily change the order of strings, e.g. to keep
> diffs
> > compact? Then we should use an OrderedDict.
> 
> Not necessarily relevant for diffs but Crowdin will use the order we define
> here. So - yes, OrderedDict sounds like a good idea.
> 
> However, the current code will create a dictionary instance on each call, most
> of the time it will be thrown away. So maybe a slightly less elegant approach
is
> better:
> 
>   try:
>     store = page_strings[page]
>   except KeyError:
>     store = page_strings[page] = OrderedDict()

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:80: comment = comment + "\n" if comment else ""
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> Nit. The ternary operator just adds uneeded complexity here. I'd go for:
> 
> if comment:
>   comment += "\n"

`comment` might be None, hence this logic.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:81: comment += ", ".join("{%d}: %s" % (i, s)
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> Nit: No reason to pack/unpack the sequence here:
> 
> .. % x for x in enumerate(..

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:88: format=format, record_default_strings=record_string)
On 2015/07/09 21:26:55, Sebastian Noack wrote:
> Recording the default strings is what we do here. But the converter should be
> agnostic of it. For it it's just a callback to call on every string. And that
is
> what the argument name should reflect.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:150: for group in grouper(files,
CrowdinAPI.FILES_PER_REQUEST):
On 2015/07/08 23:11:08, Wladimir Palant wrote:
> No need to assume that crowdin_api is a CrowdinAPI instance. Also no need to
> assume that this constant is identical for all instances. Just use
> crowdin_api.FILES_PER_REQUEST. 

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:176: for files in grouper(open_locale_files(locale,
new_files),
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> You should better first get the chunk of filenames and then open the files in
> that chunk, rather than opening all files in advance.

`open_locale_files` is a generator and I consume the groups lazily so I don't
think the files are opened in advance.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:177: CrowdinAPI.FILES_PER_REQUEST):
On 2015/07/08 23:11:05, Wladimir Palant wrote:
> As above, crowdin_api.FILES_PER_REQUEST please.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/bin/translate.p...
cms/bin/translate.py:229: archive.extract(member, locale_path)
On 2015/07/08 23:11:08, Wladimir Palant wrote:
> Please use posixpath module here rather than os.path - archive paths don't
> follow OS path conventions. This should also eliminate the need for calling
> normpath.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py
File cms/converters.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/converters.py#n...
cms/converters.py:144: if self._params["record_default_strings"]:
On 2015/07/10 21:24:04, Wladimir Palant wrote:
> On 2015/07/09 21:26:55, Sebastian Noack wrote:
> > Or the other way around.
> 
> True. If we generalize the parameter name into something like
> "localized_string_callback" then we can just as well record all strings and
let
> the callback choose which ones it needs. It should be called with |result| as
> parameter rather than |default| of course.

Done.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py#newc...
cms/sources.py:173: result = collections.OrderedDict()
On 2015/07/08 13:03:20, Sebastian Noack wrote:
> Nit: This could be simplified a little:
> 
> result = collections.OrderedDict()
> if locale != default_locale:
>   result.update(self.read_locale(default_locale, page))

I think we should leave this unrelated change out of this patchset.

https://codereview.adblockplus.org/29317015/diff/29321284/cms/utils.py
File cms/utils.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/utils.py#newcode21
cms/utils.py:21: record_default_strings=None):
On 2015/07/08 23:11:09, Wladimir Palant wrote:
> Nit: This should be record_default_string as the callback only records one
> string at a time.

Done.

kzar

Patch Set 4 : Slightly simplified request exception logic

July 12, 2015, 5:49 a.m. (2015-07-12 05:49:15 UTC) #12

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py#newcode173 cms/sources.py:173: result = collections.OrderedDict() On 2015/07/11 19:21:18, kzar wrote: > ...

July 14, 2015, 11:31 a.m. (2015-07-14 11:31:08 UTC) #13

kzar

Patch Set 5 : Addressed Sebastian's feedback https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py#newcode173 cms/sources.py:173: result = ...

July 14, 2015, 12:54 p.m. (2015-07-14 12:54:31 UTC) #14

Patch Set 5 : Addressed Sebastian's feedback

https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29317015/diff/29321284/cms/sources.py#newc...
cms/sources.py:173: result = collections.OrderedDict()
On 2015/07/14 11:31:04, Sebastian Noack wrote:
> On 2015/07/11 19:21:18, kzar wrote:
> > On 2015/07/08 13:03:20, Sebastian Noack wrote:
> > > Nit: This could be simplified a little:
> > > 
> > > result = collections.OrderedDict()
> > > if locale != default_locale:
> > >   result.update(self.read_locale(default_locale, page))
> > 
> > I think we should leave this unrelated change out of this patchset.
> 
> It's not unrelated if you change the code anyway. And in particular your
change
> is introducing the duplication here.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:45: self.project_name, api_endpoint, self.api_key
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> Please encode the parameters properly:
> 
> url = "/api/project/%s/%s?%s" % (
>   urllib.quote(self.project_name),
>   urllib.quote(api_endpoint),
>   urllib.urlencode([('key', self.api_key), ('json', '1')])
> )

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:54: fields += [(name + "[]", v) for v in value]
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> Nit: fields.extend((name + "[]", v) for v in value)

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:56: fields += [("files[%s]" % f[0], f) for f in files]
On 2015/07/14 11:31:04, Sebastian Noack wrote:
> Note that |'%s' % f| returns it object representation like |<open file
> '/dev/null', mode 'r' at 0x7fc2226df780>|. How about using simply an
> incrementing index instead? That would seem less hacky.
> 
> Also why can the part between the square brackets be empty for regular fields,
> but not for files?

So f[0] is actually the file name and we need to populate the key name for files
as that's what the API expects. (I guess it's the difference between a regular
array and an associate array.)

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:56: fields += [("files[%s]" % f[0], f) for f in files]
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> Nit: Please use .extend() here as well.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:61: timeout=urllib3.Timeout(connect=5)
On 2015/07/14 11:31:07, Sebastian Noack wrote:
> Any particular reason you specify a custom connect timeout?

During testing I found that by default it didn't seem to timeout at all.  (I
waited for several minutes!) I set the timeout to 5 seconds here as that seemed
"about right", it made things behave the same as when we were using the requests
library instead of urllib3 directly.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:63: if response.status < 200 or response.status >= 300:
On 2015/07/14 11:31:07, Sebastian Noack wrote:
> How about |response.status not in xrange(200, 299)|?

I think I prefer it as is.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:65: except urllib3.exceptions.HTTPError as e:
On 2015/07/14 11:31:06, Sebastian Noack wrote:
> Nit: Since we don't use the variable e you can omit the as-clause.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:66: logger.error("API call to %s failed:\n%s" % (url,
response.data))
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> You can pass the values for the placeholders directly to the logging function,
> rather than formatting the string here:
> 
> logger.error("API call to %s failed:\n%s", url, response.data)

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:70: return json.loads(response.data)
On 2015/07/14 11:31:04, Sebastian Noack wrote:
> How about |json.load(response)|?

This doesn't work as you would expect, even though by default decode_content
should be True[1]. I did attempt to do it this way for some time, also below
when we download the zip file but in the end I just couldn't get it to work and
to resort to using response.data instead.

[1] https://github.com/shazow/urllib3/blob/master/urllib3/response.py

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:72: logger.error("Invalid response returned by API endpoint
%s" % url)
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> Same here: logger.error("Invalid response returned by API endpoint %s", url)

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:129: data={"languages": list(enabled_locales |
required_locales)}
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> Since .request() merely iterates over the value (if it isn't a string), and
set
> objects are iterable by themselves, there is no need to turn it into a list.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:221: "/api/project/%s/download/all.zip?key=%s" % (
On 2015/07/14 11:31:07, Sebastian Noack wrote:
> As above, please use urllib.quote and urllib.urlencode here.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:225: if response.status < 200 or response.status >= 300:
On 2015/07/14 11:31:05, Sebastian Noack wrote:
> How about |response.status not in xrange(200, 299)|?

See above, I prefer it as is.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:229: with zipfile.ZipFile(io.BytesIO(response.data), "r")
as archive:
On 2015/07/14 11:31:04, Sebastian Noack wrote:
> The response is a file-like object by itself. So if you don't use the
> preload_content argument above, you can parse it to ZipFile() without wrapping
> it in a BytesIO() object.

I agree this _should_ work but in practice it just doesn't, I spent several
hours trying to get it to work and in the end gave up.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/converters.py
File cms/converters.py (right):

https://codereview.adblockplus.org/29317015/diff/29322015/cms/converters.py#n...
cms/converters.py:151: if self._params["localized_string_callback"]:
On 2015/07/14 11:31:08, Sebastian Noack wrote:
> I would rather not duplicate the key lookup:
> 
>   callback = self._params.get("localized_string_callback")
>   if callback:
>     callback(...)

Done.

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py#newcode61 cms/bin/translate.py:61: timeout=urllib3.Timeout(connect=5) On 2015/07/14 12:54:30, kzar wrote: > On 2015/07/14 ...

July 14, 2015, 2:39 p.m. (2015-07-14 14:39:29 UTC) #15

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:61: timeout=urllib3.Timeout(connect=5)
On 2015/07/14 12:54:30, kzar wrote:
> On 2015/07/14 11:31:07, Sebastian Noack wrote:
> > Any particular reason you specify a custom connect timeout?
> 
> During testing I found that by default it didn't seem to timeout at all.  (I
> waited for several minutes!) I set the timeout to 5 seconds here as that
seemed
> "about right", it made things behave the same as when we were using the
requests
> library instead of urllib3 directly.

But differently than urllib/urllib2 which we use everywhere else, right? Unless
we address a particular issue I'd rather stick to the defaults. Also I suppose
the script might still hang on "read" anyway if only applying a connect timeout.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:70: return json.loads(response.data)
On 2015/07/14 11:31:06, Sebastian Noack wrote:
> Note that like urllib/urllib2, urllib3's repsonse objects also have a close()
> method that should be called (in a finally-block).

For reference, I just realized that urllib3, magically closes the response when
it reaches EOF. Not sure if I like that behavior. But therefore closing the
response isn't necessary as opposed to urllib/urllib2.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:70: return json.loads(response.data)
On 2015/07/14 12:54:28, kzar wrote:
> On 2015/07/14 11:31:04, Sebastian Noack wrote:
> > How about |json.load(response)|?
> 
> This doesn't work as you would expect, even though by default decode_content
> should be True[1]. I did attempt to do it this way for some time, also below
> when we download the zip file but in the end I just couldn't get it to work
and
> to resort to using response.data instead.
> 
> [1] https://github.com/shazow/urllib3/blob/master/urllib3/response.py

I just tested it myself. And it did work. You have to specify
preload_content=False though. Otherwise the underlying file descriptor will
already be at EOF.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:229: with zipfile.ZipFile(io.BytesIO(response.data), "r")
as archive:
On 2015/07/14 12:54:28, kzar wrote:
> On 2015/07/14 11:31:04, Sebastian Noack wrote:
> > The response is a file-like object by itself. So if you don't use the
> > preload_content argument above, you can parse it to ZipFile() without
wrapping
> > it in a BytesIO() object.
> 
> I agree this _should_ work but in practice it just doesn't, I spent several
> hours trying to get it to work and in the end gave up.

Ah right, ZipFile() requires a file-like object that supports seek().

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:190: with open(path, "r") as f:
Nit: mode="rb" for compatibility with Windows.

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:229: 
The logic above is duplicated in CrowdInApi.request(). Perhaps moving the common
code into another method?

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:245: if ext == ".json" and locale in required_locales:
Nit: I think we should match the file extension case-insensitive.

kzar

Patch Set 6 : Addressed more feedback from Sebastian https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py#newcode61 cms/bin/translate.py:61: ...

July 15, 2015, 9:51 a.m. (2015-07-15 09:51:25 UTC) #16

Patch Set 6 : Addressed more feedback from Sebastian

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:61: timeout=urllib3.Timeout(connect=5)
On 2015/07/14 14:39:27, Sebastian Noack wrote:
> On 2015/07/14 12:54:30, kzar wrote:
> > On 2015/07/14 11:31:07, Sebastian Noack wrote:
> > > Any particular reason you specify a custom connect timeout?
> > 
> > During testing I found that by default it didn't seem to timeout at all.  (I
> > waited for several minutes!) I set the timeout to 5 seconds here as that
> seemed
> > "about right", it made things behave the same as when we were using the
> requests
> > library instead of urllib3 directly.
> 
> But differently than urllib/urllib2 which we use everywhere else, right?
Unless
> we address a particular issue I'd rather stick to the defaults. Also I suppose
> the script might still hang on "read" anyway if only applying a connect
timeout.

Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:70: return json.loads(response.data)
On 2015/07/14 14:39:28, Sebastian Noack wrote:
> On 2015/07/14 11:31:06, Sebastian Noack wrote:
> > Note that like urllib/urllib2, urllib3's repsonse objects also have a
close()
> > method that should be called (in a finally-block).
> 
> For reference, I just realized that urllib3, magically closes the response
when
> it reaches EOF. Not sure if I like that behavior. But therefore closing the
> response isn't necessary as opposed to urllib/urllib2.

Acknowledged.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:70: return json.loads(response.data)
On 2015/07/14 14:39:27, Sebastian Noack wrote:
> On 2015/07/14 12:54:28, kzar wrote:
> > On 2015/07/14 11:31:04, Sebastian Noack wrote:
> > > How about |json.load(response)|?
> > 
> > This doesn't work as you would expect, even though by default decode_content
> > should be True[1]. I did attempt to do it this way for some time, also below
> > when we download the zip file but in the end I just couldn't get it to work
> and
> > to resort to using response.data instead.
> > 
> > [1] https://github.com/shazow/urllib3/blob/master/urllib3/response.py
> 
> I just tested it myself. And it did work. You have to specify
> preload_content=False though. Otherwise the underlying file descriptor will
> already be at EOF.

You're right, I forgot to set preload_content when testing this change and I had
wrongly assumed it was the same problem I experienced before. (I previously
spent a while trying to open the zipfile in the same way, but that time I had
remembered to set preload_content.)
Anyway you're right that this works fine with json.read.
Done.

https://codereview.adblockplus.org/29317015/diff/29322015/cms/bin/translate.p...
cms/bin/translate.py:229: with zipfile.ZipFile(io.BytesIO(response.data), "r")
as archive:
On 2015/07/14 14:39:28, Sebastian Noack wrote:
> On 2015/07/14 12:54:28, kzar wrote:
> > On 2015/07/14 11:31:04, Sebastian Noack wrote:
> > > The response is a file-like object by itself. So if you don't use the
> > > preload_content argument above, you can parse it to ZipFile() without
> wrapping
> > > it in a BytesIO() object.
> > 
> > I agree this _should_ work but in practice it just doesn't, I spent several
> > hours trying to get it to work and in the end gave up.
> 
> Ah right, ZipFile() requires a file-like object that supports seek().

Glad one of us understands :p

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.py
File cms/bin/translate.py (right):

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:190: with open(path, "r") as f:
On 2015/07/14 14:39:28, Sebastian Noack wrote:
> Nit: mode="rb" for compatibility with Windows.

Done.

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:229: 
On 2015/07/14 14:39:28, Sebastian Noack wrote:
> The logic above is duplicated in CrowdInApi.request(). Perhaps moving the
common
> code into another method?

Done.

https://codereview.adblockplus.org/29317015/diff/29322229/cms/bin/translate.p...
cms/bin/translate.py:245: if ext == ".json" and locale in required_locales:
On 2015/07/14 14:39:29, Sebastian Noack wrote:
> Nit: I think we should match the file extension case-insensitive.

Done.

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py#newcode52 cms/bin/translate.py:52: request_method, str(url), **kwargs Converting url to an str object ...

July 15, 2015, 10:31 a.m. (2015-07-15 10:31:04 UTC) #17

kzar

Patch Set 7 : Addressed even more feedback from Sebastian https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py#newcode52 ...

July 15, 2015, 11:09 a.m. (2015-07-15 11:09:03 UTC) #18

Sebastian Noack

LGTM https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322352/cms/bin/translate.py#newcode52 cms/bin/translate.py:52: request_method, str(url), **kwargs On 2015/07/15 11:09:03, kzar wrote: ...

July 15, 2015, 11:27 a.m. (2015-07-15 11:27:12 UTC) #19

Wladimir Palant

Feel free to address the nit below. LGTM if you don't. https://codereview.adblockplus.org/29317015/diff/29322374/cms/bin/translate.py File cms/bin/translate.py (right): ...

July 16, 2015, 12:17 p.m. (2015-07-16 12:17:23 UTC) #20

Sebastian Noack

https://codereview.adblockplus.org/29317015/diff/29322374/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322374/cms/bin/translate.py#newcode44 cms/bin/translate.py:44: def raw_request(self, request_method, api_endpoint, query_params, **kwargs): On 2015/07/16 12:17:23, ...

July 16, 2015, 12:23 p.m. (2015-07-16 12:23:56 UTC) #21

kzar

Patch Set 8 : Give query_params a default value https://codereview.adblockplus.org/29317015/diff/29322374/cms/bin/translate.py File cms/bin/translate.py (right): https://codereview.adblockplus.org/29317015/diff/29322374/cms/bin/translate.py#newcode44 cms/bin/translate.py:44: ...

July 16, 2015, 12:49 p.m. (2015-07-16 12:49:32 UTC) #22

Message was sent while issue was closed.

LGTM

Issue 29317015: Issue 2625 - [cms] Crowdin synchronisation script (Closed)

Description

Patch Set 1 #

Patch Set 2 : Addressed Wladimir's feedback #

Patch Set 3 : Addressed further feedback #

Patch Set 4 : Slightly simplified request exception logic #

Patch Set 5 : Addressed Sebastian's feedback #

Patch Set 6 : Addressed more feedback from Sebastian #

Patch Set 7 : Addressed even more feedback from Sebastian #

Patch Set 8 : Give query_params a default value #

Messages