|
|
DescriptionIssue 5777 - Update crowdin interface
Patch Set 1 #Patch Set 2 : Added comment #
Total comments: 11
Patch Set 3 : Outsource post / get #
Total comments: 8
Patch Set 4 : Refactoring postFiles, addressing comments #Patch Set 5 : Purged obsolete distinguishing between crowdin_{get|post} #
Total comments: 15
Patch Set 6 : Always require a json-response #
Total comments: 9
Patch Set 7 : Addressed Dave's comments #Patch Set 8 : Print actual HTTP Error Code on Exception #
Total comments: 6
Patch Set 9 : Addressed Vasily's comments #
Total comments: 1
Patch Set 10 : Belatedly addressed Sebastian's comments #
Total comments: 11
Patch Set 11 : Mimetype, PEP-8 #MessagesTotal messages: 23
Patch Set 1 Fixing Issue 5777 - Update crowdin interface * Let all Crowdin interfaces use correct endpoints Errors were: 1. the list of supported languages should now be fetched from a project-specific url (as done in cms) 2. a permanent move from api.crowdin.net to api.crowdin.com caused urllib2 to lose post-params during the redirect
Hi Tristan, I like the your refactoring. The repetition of Crowdin URLs all over the code wasn't great. See some comments below and also not that I haven't tested the code. Cheers, Vasily P.S. When I look at your changes, I feel the urge to continue the refactoring, for example to stop passing around `projectName` and `key`. It seems like this would end up in some kind of `CrowdinProject` class that could encapsulate these common parameters, the URL calculation code, urllib2 calls, JSON encoding/decoding and maybe even a bit of logic. We should probably talk later about the possibility of wrapping Crowdin logic in small library and sharing it between buildtools and cms. See also: https://github.com/epyx-src/crowdin-client (it's old and probably broken, but still could be noteworthy). https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:103: get.update({'key': key}) Why not just `get['key'] = key`? Seems simpler. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:107: query = dict(urlparse.parse_qsl(query)) Could there be anything in this query? It seems like it can only come from the `action` argument, but in all of the calls to `crowdin_url` `action` is just a simple path fragment without anything like "?foo=bar&baz=qux" that could end up in the query. Or am I missing something? https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:391: data.update({'type': 'chrome'}) Also could be just `data['type'] = 'chrome'`. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:424: postFiles(files, It would probably be more readable if this was split into two statements: url = crowdin_url(...) postFiles(files, url) What do you think? https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:430: result = urllib2.urlopen(crowdin_url(projectName, 'export', key)).read() This pattern of `urllib2.urlopen(crowdin_url(....)).read()` seems to be a common one. Maybe makes sense to make a function for it.
Patch Set 3 Addressed or commented Vasily's comments. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:103: get.update({'key': key}) On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > Why not just `get['key'] = key`? Seems simpler. Done. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:107: query = dict(urlparse.parse_qsl(query)) On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > Could there be anything in this query? It seems like it can only come from the > `action` argument, but in all of the calls to `crowdin_url` `action` is just a > simple path fragment without anything like "?foo=bar&baz=qux" that could end up > in the query. Or am I missing something? No, you are right - currently, there can't be query parameters - but i somehow don't like the idea that there shouldn't be any (e.g. when someone imports this function and expects it to keep the original query - as i did during development). In case someone absolutely opposes this, i would just drop a possibly present query - but currently i'd like to keep it. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:391: data.update({'type': 'chrome'}) On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > Also could be just `data['type'] = 'chrome'`. Done. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:424: postFiles(files, On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > It would probably be more readable if this was split into two statements: > > url = crowdin_url(...) > postFiles(files, url) > > What do you think? Done. https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:430: result = urllib2.urlopen(crowdin_url(projectName, 'export', key)).read() On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > This pattern of `urllib2.urlopen(crowdin_url(....)).read()` seems to be a common > one. Maybe makes sense to make a function for it. Yes - most of the time, the result was used to detect server side errors - outsourced this too.
Hi Tristan, I like this version even better. A couple more comments. Cheers, Vasily https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556604/localeTools.py#newc... localeTools.py:107: query = dict(urlparse.parse_qsl(query)) On 2017/09/26 12:13:13, tlucas wrote: > On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > > Could there be anything in this query? It seems like it can only come from the > > `action` argument, but in all of the calls to `crowdin_url` `action` is just a > > simple path fragment without anything like "?foo=bar&baz=qux" that could end > up > > in the query. Or am I missing something? > > No, you are right - currently, there can't be query parameters - but i somehow > don't like the idea that there shouldn't be any (e.g. when someone imports this > function and expects it to keep the original query - as i did during > development). > > In case someone absolutely opposes this, i would just drop a possibly present > query - but currently i'd like to keep it. Acknowledged. https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:128: def crowdin_get(*args, **kwargs): This get via post with None as a first argument looks kind of sloppy. I mildly prefer renaming `crowdin_post` to `crowdin_request` and then creating an additional `crowdin_post` that delegates to it (or is an alias for it). It would be functionally equivalent to the current code as far as interpreter is concerned, but would express the intention better for the human readers of the code. https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:336: allowedLocales = json.loads(crowdin_get(projectName, 'supported-languages', For extra style points you could check if there's `{'json': 1}` in parameters in `crowdin_post` and automatically `json.loads` if there is. This would make this code even leaner. Totally up to you if you want to do it? :D https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:366: result = urllib2.urlopen(request).read() This could be refactored using `crowdin_post` to be more uniform. Would make this function take a bunch of parameters, which is kind of annoying, but still worth considering. What do you think? https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:407: result = urllib2.urlopen( Isn't this `crowdin_get`?
Patch Set 4 * renaming postFiles / letting it only prepare a body / headers * json.loads() fabulousness https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:128: def crowdin_get(*args, **kwargs): On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > This get via post with None as a first argument looks kind of sloppy. I mildly > prefer renaming `crowdin_post` to `crowdin_request` and then creating an > additional `crowdin_post` that delegates to it (or is an alias for it). It would > be functionally equivalent to the current code as far as interpreter is > concerned, but would express the intention better for the human readers of the > code. Done. https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:336: allowedLocales = json.loads(crowdin_get(projectName, 'supported-languages', On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > For extra style points you could check if there's `{'json': 1}` in parameters in > `crowdin_post` and automatically `json.loads` if there is. This would make this > code even leaner. Totally up to you if you want to do it? :D Nice idea, done. https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:366: result = urllib2.urlopen(request).read() On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > This could be refactored using `crowdin_post` to be more uniform. Would make > this function take a bunch of parameters, which is kind of annoying, but still > worth considering. What do you think? I refactored this function to only prepare a body / headers and passed the result to crowdin_post - how about this? (No parameter flood) https://codereview.adblockplus.org/29556601/diff/29556633/localeTools.py#newc... localeTools.py:407: result = urllib2.urlopen( On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > Isn't this `crowdin_get`? It is, missed that :) Done.
Patch Set 5 Adjuste crowdin_{get|post}, please see in comments. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:114: def crowdin_request(project_name, action, key, get={}, post_data=None, crowdin_request can perfectly handle both get and post requests. Readability would imho not justify a crowdin_get = _crowdin_request crowdin_post = _crowdin_request
https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (left): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#oldc... localeTools.py:327: body = body.encode('utf-8') This seems unrelated, but I noticed that the encoding is not announced in the headers above. If we don't mess up the encoding here that is just because Crowdin assumes UTF-8 regardless of the headers. It would probably be better to still set the correct Content-Type including encoding. In particular if we want to become more robust against future minor changes on their end. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:106: query = dict(urlparse.parse_qsl(query)) Is there any way that there will be a query part here? If not, it seems the logic here could be as simple as: '{}?{}'.format(url, urllib.urlencode(get)) From there it can be further simplified by removing the placeholders from CROWDIN_AP_URL, in order to have only a single format operation: '{}/{}/{}?{}'.format(CROWDIN_AP_URL, project_name, action, urllib.urlencode(get)) https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:126: if raises and result.find('<success') < 0: Note that `result.find('<success') < 0` is equivalent to `'<success' in result`. But wouldn't it be sufficient/better to check the HTTP response status instead? https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:126: if raises and result.find('<success') < 0: Why is a flag necessary to opt-in for error checking? Can't/Shouldn't we just always abort if we get an erroneous response? https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:342: {'json': 1}) I wonder what would happen if we always set json=1, rather than explicitly (and redundantly) specifying it? It seems the cases where we don't we don't care about the response, and its format, anyway. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:408: data = {'titles[{}]'.format(name): re.sub(r'\.json', '', name) `re.sub(r'\.json', '', name)` is equivalent to `name.replace('.json', '')`. Either way, I wonder whether this logic is accurate, you probably only want to strip trailing occurrences, right? Does it even matter whether it is .json, or can we just strip any file extension using os.path.splitext(name)[0]?
Patch Set 1 * Always require a json response (load json only if not otherwise specified) * Rely urllib2 to raise a HTTPError in case of a failure @ crowdin * adding charset to headers https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (left): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#oldc... localeTools.py:327: body = body.encode('utf-8') On 2017/09/27 04:04:52, Sebastian Noack wrote: > This seems unrelated, but I noticed that the encoding is not announced in the > headers above. If we don't mess up the encoding here that is just because > Crowdin assumes UTF-8 regardless of the headers. It would probably be better to > still set the correct Content-Type including encoding. In particular if we want > to become more robust against future minor changes on their end. Done. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:106: query = dict(urlparse.parse_qsl(query)) On 2017/09/27 04:04:53, Sebastian Noack wrote: > Is there any way that there will be a query part here? If not, it seems the > logic here could be as simple as: > > '{}?{}'.format(url, urllib.urlencode(get)) > > From there it can be further simplified by removing the placeholders from > CROWDIN_AP_URL, in order to have only a single format operation: > > '{}/{}/{}?{}'.format(CROWDIN_AP_URL, project_name, action, > urllib.urlencode(get)) Discussion with Vasily: On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > On 2017/09/26 12:13:13, tlucas wrote: > > On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > > > Could there be anything in this query? It seems like it can only come from > the > > > `action` argument, but in all of the calls to `crowdin_url` `action` is just > a > > > simple path fragment without anything like "?foo=bar&baz=qux" that could end > > up > > > in the query. Or am I missing something? > > > > No, you are right - currently, there can't be query parameters - but i somehow > > don't like the idea that there shouldn't be any (e.g. when someone imports > this > > function and expects it to keep the original query - as i did during > > development). > > > > In case someone absolutely opposes this, i would just drop a possibly present > > query - but currently i'd like to keep it. > > Acknowledged. Summary: it could be simplified, yes - but i somehow don't like the idea that there shouldn't be an initial query (which e.g. helped a lot during development while importing this function) https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:126: if raises and result.find('<success') < 0: On 2017/09/27 04:04:52, Sebastian Noack wrote: > Note that `result.find('<success') < 0` is equivalent to `'<success' in result`. > > But wouldn't it be sufficient/better to check the HTTP response status instead? On 2017/09/27 04:04:53, Sebastian Noack wrote: > Why is a flag necessary to opt-in for error checking? Can't/Shouldn't we just > always abort if we get an erroneous response? After further investigation, it seems like relying on urllib2 to raise a HTTPError is indeed sufficient - Done. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:342: {'json': 1}) On 2017/09/27 04:04:53, Sebastian Noack wrote: > I wonder what would happen if we always set json=1, rather than explicitly (and > redundantly) specifying it? It seems the cases where we don't we don't care > about the response, and its format, anyway. There would be only one case where we explicitly expect a non-json response (see last comment in this file). Requiring the response to be json seems to only apply to this case when an error occurs, so we're safe to always require json - done. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:408: data = {'titles[{}]'.format(name): re.sub(r'\.json', '', name) On 2017/09/27 04:04:53, Sebastian Noack wrote: > `re.sub(r'\.json', '', name)` is equivalent to `name.replace('.json', '')`. > > Either way, I wonder whether this logic is accurate, you probably only want to > strip trailing occurrences, right? Does it even matter whether it is .json, or > can we just strip any file extension using os.path.splitext(name)[0]? I honestly don't know what Wladimir's intention was for this logic (he is the original author, see git commit 059de001 or hg changeset 208:26cd8d5e8c78). I can only assume the following: In the lines above this, the code checks whether the given file's extension is .json - and converts it to .json if not. I guess we wanted the title of a file to be the filename without .json with a note of it's original format (if it was anything other than json) Regarding the equivalence: yes, i merely change the comprehension to a dict comprehension, without changing the inner logic. I wouldn't mind if you want me to use .replace(). https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:453: result = crowdin_request(projectName, 'download/all.zip', key) This will be the only call expecting the response to be raw data.
Hey guys, please note the typo in the last mail: it's supposed to be "Patch Set 6", not "Patch Set 1".
Some of these changes look kind of unrelated, but I guess there's no harm improving things. Perhaps we should add tests too though, in case we changed some behaviour by mistake? https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:99: def crowdin_url(project_name, action, key, get={}): IIRC default parameter values in Python are shared, so you generally shouldn't mutate them or use them when they might be mutated. (Same below.) https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:128: raise Exception( Why catch this and raise it again, does it make the output clearer? https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:372: 'Content-Type': 'multipart/form-data; ; charset=utf-8; boundary=' + boundary, This line seems too long, does this code pass linting? https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:449: # let crowdin build the project Maybe this should be a docstring for the function instead?
Patch Set 7: * Added a docstring to getTranslations * addressed PEP-8 error Thanks for the review Dave! On 2017/09/28 09:48:30, kzar wrote: > Some of these changes look kind of unrelated, but I guess there's no harm > improving things. Perhaps we should add tests too though, in case we changed > some behaviour by mistake? I agree on the need for tests - but this would be non-trivial (and imho exceed the scope of this P1). However the test for the crowdin related build.py commands will / should be added with https://issues.adblockplus.org/ticket/5483 anyway. https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:99: def crowdin_url(project_name, action, key, get={}): On 2017/09/28 09:48:29, kzar wrote: > IIRC default parameter values in Python are shared, so you generally shouldn't > mutate them or use them when they might be mutated. (Same below.) As discussed in IRC: You are correct, but the implied problems can't occur since 'key' and 'json' get written explicitly everytime. Won't change unless someone insists. https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:128: raise Exception( On 2017/09/28 09:48:29, kzar wrote: > Why catch this and raise it again, does it make the output clearer? It does - e.g. if urllib2 raised an HTTPError(400) this would be the only output: File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) urllib2.HTTPError: HTTP Error 400: Bad Request compared to catching and raising again: Exception: Server indicated that the operation was not successful { "success":false, "error":{ "code":13, "message":"Language \"foo\" is invalid or not supported" } } https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:372: 'Content-Type': 'multipart/form-data; ; charset=utf-8; boundary=' + boundary, On 2017/09/28 09:48:29, kzar wrote: > This line seems too long, does this code pass linting? It's about time to get rid of tox.ini 'ignore' entries... Done. https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:449: # let crowdin build the project On 2017/09/28 09:48:30, kzar wrote: > Maybe this should be a docstring for the function instead? Done.
https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... localeTools.py:128: raise Exception( On 2017/09/28 10:48:40, tlucas wrote: > On 2017/09/28 09:48:29, kzar wrote: > > Why catch this and raise it again, does it make the output clearer? > > It does - e.g. if urllib2 raised an HTTPError(400) this would be the only > output: > > File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default > raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) > urllib2.HTTPError: HTTP Error 400: Bad Request > > compared to catching and raising again: > > Exception: Server indicated that the operation was not successful > { > "success":false, > "error":{ > "code":13, > "message":"Language \"foo\" is invalid or not supported" > } > } Cool fair enough. Mind making it display the status code too though? Knowing it was a HTTP 400 error is actually pretty useful.
Patch Set 8 * Print actual HTTP Error On 2017/09/28 11:34:44, kzar wrote: > https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py > File localeTools.py (right): > > https://codereview.adblockplus.org/29556601/diff/29557586/localeTools.py#newc... > localeTools.py:128: raise Exception( > On 2017/09/28 10:48:40, tlucas wrote: > > On 2017/09/28 09:48:29, kzar wrote: > > > Why catch this and raise it again, does it make the output clearer? > > > > It does - e.g. if urllib2 raised an HTTPError(400) this would be the only > > output: > > > > File "/usr/lib/python2.7/urllib2.py", line 556, in http_error_default > > raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) > > urllib2.HTTPError: HTTP Error 400: Bad Request > > > > compared to catching and raising again: > > > > Exception: Server indicated that the operation was not successful > > { > > "success":false, > > "error":{ > > "code":13, > > "message":"Language \"foo\" is invalid or not supported" > > } > > } > > Cool fair enough. Mind making it display the status code too though? Knowing it > was a HTTP 400 error is actually pretty useful. Good idea - done.
LGTM
Hi Tristan, A couple more nits (see below). Cheers, Vasily https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:107: query = dict(urlparse.parse_qsl(query)) Sorry that I'm only mentioning this now, but is there any reason to use `dict(urlparse.parse_qsl(query))` instead of `urlparse.parse_qs(query)`? It seems that there would only be a difference in cases with multiple values for the same query parameter and arguably `parse_qs` behaves better in such case (it doesn't discard all values except for the last, but makes lists of values instead). https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:354: def crowdin_body_headers(files): This function name is a bit unclear, probably `crowdin_body_and_headers` would be better. Or even better and more "in the language of the domain" we could call it something like `crowdin_prepare_upload`. https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:355: """Create a post body and according headers, which Crowdin can handle.""" I think "matching" instead of "according" would be better here.
Patch Set 9 * Applied Vasily's comments https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:107: query = dict(urlparse.parse_qsl(query)) On 2017/09/28 14:35:00, Vasily Kuznetsov wrote: > Sorry that I'm only mentioning this now, but is there any reason to use > `dict(urlparse.parse_qsl(query))` instead of `urlparse.parse_qs(query)`? It > seems that there would only be a difference in cases with multiple values for > the same query parameter and arguably `parse_qs` behaves better in such case (it > doesn't discard all values except for the last, but makes lists of values > instead). You are right - Done. https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:354: def crowdin_body_headers(files): On 2017/09/28 14:35:00, Vasily Kuznetsov wrote: > This function name is a bit unclear, probably `crowdin_body_and_headers` would > be better. Or even better and more "in the language of the domain" we could call > it something like `crowdin_prepare_upload`. Fair enough - Done. https://codereview.adblockplus.org/29556601/diff/29558616/localeTools.py#newc... localeTools.py:355: """Create a post body and according headers, which Crowdin can handle.""" On 2017/09/28 14:35:00, Vasily Kuznetsov wrote: > I think "matching" instead of "according" would be better here. Done.
LGTM
https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:106: query = dict(urlparse.parse_qsl(query)) On 2017/09/27 10:51:31, tlucas wrote: > On 2017/09/27 04:04:53, Sebastian Noack wrote: > > Is there any way that there will be a query part here? If not, it seems the > > logic here could be as simple as: > > > > '{}?{}'.format(url, urllib.urlencode(get)) > > > > From there it can be further simplified by removing the placeholders from > > CROWDIN_AP_URL, in order to have only a single format operation: > > > > '{}/{}/{}?{}'.format(CROWDIN_AP_URL, project_name, action, > > urllib.urlencode(get)) > > Discussion with Vasily: > > On 2017/09/26 13:11:36, Vasily Kuznetsov wrote: > > On 2017/09/26 12:13:13, tlucas wrote: > > > On 2017/09/26 11:05:52, Vasily Kuznetsov wrote: > > > > Could there be anything in this query? It seems like it can only come from > > the > > > > `action` argument, but in all of the calls to `crowdin_url` `action` is > just > > a > > > > simple path fragment without anything like "?foo=bar&baz=qux" that could > end > > > up > > > > in the query. Or am I missing something? > > > > > > No, you are right - currently, there can't be query parameters - but i > somehow > > > don't like the idea that there shouldn't be any (e.g. when someone imports > > this > > > function and expects it to keep the original query - as i did during > > > development). > > > > > > In case someone absolutely opposes this, i would just drop a possibly > present > > > query - but currently i'd like to keep it. > > > > Acknowledged. > > Summary: it could be simplified, yes - but i somehow don't like the idea that > there shouldn't be an initial query (which e.g. helped a lot during development > while importing this function) So it seems Vasily agrees with me. Anyway, if you want this code to be more robust against potential false assumptions, you should rather escape "project_name" and "action", which is both simpler and more accurate than supporting query parameters injected in these arguments: CROWDIN_AP_URL = 'https://api.crowdin.com/api/project' '{}/{}/{}?{}'.format(CROWDIN_AP_URL, urllib.quote(project_name), urllib.quote(action), urllib.urlencode(dict(get, key=key, json=1)) As for debugging, it shouldn't make a difference where in the code you add other parameters temporarily, as long as you only need to do it in one place. Also note, that in the suggested code above I used `dict(get, key=key, json=1)` instead of modifying the passed in dictionary in place. This makes sense here, because if `get` isn't a literal expression in the calling code but an object that is reused, you wouldn't expect this function call to modify it. https://codereview.adblockplus.org/29556601/diff/29556656/localeTools.py#newc... localeTools.py:408: data = {'titles[{}]'.format(name): re.sub(r'\.json', '', name) On 2017/09/27 10:51:31, tlucas wrote: > On 2017/09/27 04:04:53, Sebastian Noack wrote: > > `re.sub(r'\.json', '', name)` is equivalent to `name.replace('.json', '')`. > > > > Either way, I wonder whether this logic is accurate, you probably only want to > > strip trailing occurrences, right? Does it even matter whether it is .json, or > > can we just strip any file extension using os.path.splitext(name)[0]? > > I honestly don't know what Wladimir's intention was for this logic (he is the > original author, see git commit 059de001 or hg changeset 208:26cd8d5e8c78). I > can only assume the following: > > In the lines above this, the code checks whether the given file's extension is > .json - and converts it to .json if not. I guess we wanted the title of a file > to be the filename without .json with a note of it's original format (if it was > anything other than json) > > Regarding the equivalence: yes, i merely change the comprehension to a dict > comprehension, without changing the inner logic. I wouldn't mind if you want me > to use .replace(). Yes, if removing each occurrence of the sub-string ".json" at any position is what is intended here, we should just use .replace() instead of a regular expression. But this certainly has been a mistake in the first place. If any file extension should be removed we should use: os.path.splitext(name)[0] If the file extension should only be removed if it is ".json", we should use: re.sub(r'\.json$', name) Since we only have json files here anyway, the result is the same, so I might lend towards the more generic approach (i.e. os.path.splitext). https://codereview.adblockplus.org/29556601/diff/29558635/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558635/localeTools.py#newc... localeTools.py:370: 'Content-Type': ('multipart/form-data; ; charset=utf-8; ' This seems incorrect, multipart/form-data doesn't have a charset, the charset should be on the subtypes of the individual files in there.
Patch Set 10 * Applied Sebastian's comments * Fixed an error where the query was accidentally overwritten (and hence not passed at all) https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:101: return '{}/{}/{}?{}'.format(CROWDIN_AP_URL, as discussed: simpler building of the desired url https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:354: 'Content-Type: application/octet-stream; charset=utf-8\r\n' as discussed: charset in each file's individual Content-Type https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:399: query = {'titles[{}]'.format(name): os.path.splitext(name)[0] os.path.splitext as discussed https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:403: crowdin_request(projectName, 'add-file', key, query, post_data=data, Hasn't benn noticed before: the query was overwritten by the post data.
https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:101: return '{}/{}/{}?{}'.format(CROWDIN_AP_URL, On 2017/09/28 21:33:58, tlucas wrote: > as discussed: simpler building of the desired url Is this even worth a separate function now? I would rather just inline this expression in crowdin_request(). https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:353: 'filename="{name}"\r\n' Nit: You can avoid wrapping before the end of line (as well as wrapping the format argument) if you format this code like that: body += ( '...' '...' ).format(...) https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:354: 'Content-Type: application/octet-stream; charset=utf-8\r\n' On 2017/09/28 21:33:58, tlucas wrote: > as discussed: charset in each file's individual Content-Type This should be "application/json", I guess?
https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:354: 'Content-Type: application/octet-stream; charset=utf-8\r\n' On 2017/09/28 22:18:37, Sebastian Noack wrote: > On 2017/09/28 21:33:58, tlucas wrote: > > as discussed: charset in each file's individual Content-Type > > This should be "application/json", I guess? Or even better, you could dynamically detect the type based on the file extension using mimetypes.guess_type(). It will always be JSON anyway, but it if we don't have to hardcode it that is better.
Patch Set 11 * removed crowdin_url() * guessing mimetype instead of hardcoding https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py File localeTools.py (right): https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:101: return '{}/{}/{}?{}'.format(CROWDIN_AP_URL, On 2017/09/28 22:18:37, Sebastian Noack wrote: > On 2017/09/28 21:33:58, tlucas wrote: > > as discussed: simpler building of the desired url > > Is this even worth a separate function now? I would rather just inline this > expression in crowdin_request(). I agree, no extra function necessary - Done. https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:353: 'filename="{name}"\r\n' On 2017/09/28 22:18:37, Sebastian Noack wrote: > Nit: You can avoid wrapping before the end of line (as well as wrapping the > format argument) if you format this code like that: > > body += ( > '...' > '...' > ).format(...) Done. https://codereview.adblockplus.org/29556601/diff/29558680/localeTools.py#newc... localeTools.py:354: 'Content-Type: application/octet-stream; charset=utf-8\r\n' On 2017/09/28 22:51:55, Sebastian Noack wrote: > On 2017/09/28 22:18:37, Sebastian Noack wrote: > > On 2017/09/28 21:33:58, tlucas wrote: > > > as discussed: charset in each file's individual Content-Type > > > > This should be "application/json", I guess? > > Or even better, you could dynamically detect the type based on the file > extension using mimetypes.guess_type(). It will always be JSON anyway, but it if > we don't have to hardcode it that is better. Good point - Done.
LGTM |