|
|
Created:
Sept. 15, 2015, 4:37 p.m. by René Jeschke Modified:
Sept. 24, 2015, 8:25 a.m. Reviewers:
Felix Dahlke Visibility:
Public. |
DescriptionIssue 3047 - Change default search engines
Patch Set 1 #Patch Set 2 : Now really writing region.properties to region.properties. #Patch Set 3 : Added a missing colon to a log message. #Patch Set 4 : Limit search.order entries to 3. #Patch Set 5 : Re-added Google/Yahoo/Amazon for China. #Patch Set 6 : More search engines for China. #Patch Set 7 : Removed a comma. #Patch Set 8 : regex enhancement, code duplication reduction. #
Total comments: 10
Patch Set 9 : Beautification #Patch Set 10 : trim()/strip() #
Total comments: 24
Patch Set 11 : Names and stuff #Patch Set 12 : Brackets #Patch Set 13 : Removed overly complicated currying, replaced with class #Patch Set 14 : That trim() was persistent, removed _again_. #
Total comments: 8
Patch Set 15 : Fixed even more review issues #
MessagesTotal messages: 9
Didn't do a full review, but there's some style stuff we should get out of the way first. General approach looks OK to me. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... File mozharness/abb/abb_transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:1: # vim:fileencoding=utf-8:et:ts=4:sts=4 We generally don't add mode line headers: https://adblockplus.org/en/coding-style#general https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:21: ABB_LOCALE_RE = re.compile("^[a-z]{2}(-[A-Z]{2})?$") There is no need to prefix anything with "abb" here, every module has its own namespace. What Python doesn't have, however, is a notion of visibility - that's why private symbols should be prefixed with a single underscore, e.g. `_LOCALE_RE`. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:32: ABB_CFG = { Looks like this could be simplified: _SEARCH_ENGINE_ORDER = { "en-US": [ "duckduckgo", ... ], "zh-CN": [ "baidu", ... ] } https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... File mozharness/mozilla/l10n/multi_locale_build.py (right): https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... mozharness/mozilla/l10n/multi_locale_build.py:22: from mozharness.abb.abb_transform_locales import abb_transform_locales_impl There is no need to suffix that function with "impl" within the abb_transform_locales module (which should be renamed to `transform_locales` - there is no need for the prefix, there are namespaces). You could achieve the same effect here with: `from mozharness.abb.transform_locales import transform_locales as abb_transform_locales_impl` Although I personally think of interfaces when hearing "impl" and would probably go for "abb_to_transform_locales" or, more likely, just import mozharness.abb.transform_locales and call the function fully qualified below. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... mozharness/mozilla/l10n/multi_locale_build.py:252: Nit: Even though PEP-8 suggests two blank lines between functions, it states there should only be a single one between methods, as Mozilla does it here, too: https://www.python.org/dev/peps/pep-0008/#blank-lines
https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... File mozharness/abb/abb_transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:1: # vim:fileencoding=utf-8:et:ts=4:sts=4 On 2015/09/17 09:52:34, Felix Dahlke wrote: > We generally don't add mode line headers: > https://adblockplus.org/en/coding-style#general Done. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:21: ABB_LOCALE_RE = re.compile("^[a-z]{2}(-[A-Z]{2})?$") On 2015/09/17 09:52:34, Felix Dahlke wrote: > There is no need to prefix anything with "abb" here, every module has its own > namespace. > > What Python doesn't have, however, is a notion of visibility - that's why > private symbols should be prefixed with a single underscore, e.g. `_LOCALE_RE`. Done. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:32: ABB_CFG = { On 2015/09/17 09:52:35, Felix Dahlke wrote: > Looks like this could be simplified: > > _SEARCH_ENGINE_ORDER = { > "en-US": [ > "duckduckgo", > ... > ], > "zh-CN": [ > "baidu", > ... > ] > } Done. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... File mozharness/mozilla/l10n/multi_locale_build.py (right): https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... mozharness/mozilla/l10n/multi_locale_build.py:22: from mozharness.abb.abb_transform_locales import abb_transform_locales_impl On 2015/09/17 09:52:35, Felix Dahlke wrote: > There is no need to suffix that function with "impl" within the > abb_transform_locales module (which should be renamed to `transform_locales` - > there is no need for the prefix, there are namespaces). > > You could achieve the same effect here with: > > `from mozharness.abb.transform_locales import transform_locales as > abb_transform_locales_impl` > > Although I personally think of interfaces when hearing "impl" and would probably > go for "abb_to_transform_locales" or, more likely, just import > mozharness.abb.transform_locales and call the function fully qualified below. Done. https://codereview.adblockplus.org/29327949/diff/29328081/mozharness/mozilla/... mozharness/mozilla/l10n/multi_locale_build.py:252: On 2015/09/17 09:52:35, Felix Dahlke wrote: > Nit: Even though PEP-8 suggests two blank lines between functions, it states > there should only be a single one between methods, as Mozilla does it here, too: > https://www.python.org/dev/peps/pep-0008/#blank-lines Done.
So, how I see it, we will adjust the en-US locale, too? Then we should revert our changes to region.properties. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... File mozharness/abb/abb_transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:1: # This file is part of Adblock Plus As I suggested before, this file should be renamed to "transform_locales.py" - there's no point in the "abb_" prefix, it's already in a module called "abb". https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:31: "en-US": ["duckduckgo", Being our file, I guess we should stick to 2 space indents here. But let's please do that once all other comments have been addressed, in a separate patch set. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:45: def _get_shortname_from_id(needle, engine_ids, engine_map): Nit: I would find "engine_names" more fitting than "engine_map" here - took some scrolling to figure out what this is supposed to do, and the function says it gets a name. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:61: def _transform_locale(locale, path, fns): Wouldn't hurt readability to split this big function up into smaller ones (as evident from the fact that it has sections preceeded by comments), but I wouldn't insist. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:65: list_file = os.path.join(path, _LIST_TXT_PATH) Nit: Being a path, not a file object, `list_path`? Likewise `region_path` and `xml_path` below. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:77: white_re = re.compile("^(%s).*$" % ("|".join(whitelist))) Nit: No need for the extra parentheses around "|".join(whitelist) https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:81: for line in open(list_file, "r"): This way, the file will stay open, you should use a with block: with open(list_path, "r") as list_file: for line in list_file: ... https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:175: def transform_locales(obj_dir, obj): I found the name "obj" pretty non-obvious here, it is in fact a MultiLocaleBuild instance, and not really related to `obj_dir` at all. Why not call it multi_locale_build or build_object or something like that? Also, being sort of a pseudo method, I would expect it to be the first parameter, where `self` always is. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:176: fns = {"info": _print_info(obj), I think this is more complex than it would have to be: `_print_info` etc. are just wrappers that can deal with `obj` being non-None. Why not just check for that once, right at the beginning of this function? If we know info, error and fatal can be called, there is no need for the wrappers, and also not for this `fns` map here - you could just pass in `obj` instead of `fns`. Alternatively, you could turn this into a class that has the build object as a member. Or, I guess, you could even turn this into a mixin for MultiBuildObject. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:182: fns["fatal"]("'dist/bin/chrome' not existent in '%s'" % obj_dir) I guess we shouldn't duplicate "dist/bin/chrome" here. You could just say: `"'%s' does not exist" % chrome_path` https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:188: locales.sort() Why sort them?
https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... File mozharness/abb/abb_transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:31: "en-US": ["duckduckgo", On 2015/09/17 14:57:56, Felix Dahlke wrote: > Being our file, I guess we should stick to 2 space indents here. But let's > please do that once all other comments have been addressed, in a separate patch > set. Sure. Wasn't aware that we're not fully pep8 :-D https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:45: def _get_shortname_from_id(needle, engine_ids, engine_map): On 2015/09/17 14:57:57, Felix Dahlke wrote: > Nit: I would find "engine_names" more fitting than "engine_map" here - took some > scrolling to figure out what this is supposed to do, and the function says it > gets a name. Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:61: def _transform_locale(locale, path, fns): On 2015/09/17 14:57:57, Felix Dahlke wrote: > Wouldn't hurt readability to split this big function up into smaller ones (as > evident from the fact that it has sections preceeded by comments), but I > wouldn't insist. Then I prefer leaving it like it is. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:65: list_file = os.path.join(path, _LIST_TXT_PATH) On 2015/09/17 14:57:57, Felix Dahlke wrote: > Nit: Being a path, not a file object, `list_path`? Likewise `region_path` and > `xml_path` below. Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:77: white_re = re.compile("^(%s).*$" % ("|".join(whitelist))) On 2015/09/17 14:57:57, Felix Dahlke wrote: > Nit: No need for the extra parentheses around "|".join(whitelist) Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:81: for line in open(list_file, "r"): On 2015/09/17 14:57:57, Felix Dahlke wrote: > This way, the file will stay open, you should use a with block: > > with open(list_path, "r") as list_file: > for line in list_file: > ... Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:175: def transform_locales(obj_dir, obj): On 2015/09/17 14:57:57, Felix Dahlke wrote: > I found the name "obj" pretty non-obvious here, it is in fact a MultiLocaleBuild > instance, and not really related to `obj_dir` at all. Why not call it > multi_locale_build or build_object or something like that? > > Also, being sort of a pseudo method, I would expect it to be the first > parameter, where `self` always is. Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:176: fns = {"info": _print_info(obj), On 2015/09/17 14:57:56, Felix Dahlke wrote: > I think this is more complex than it would have to be: > > `_print_info` etc. are just wrappers that can deal with `obj` being non-None. > Why not just check for that once, right at the beginning of this function? > > If we know info, error and fatal can be called, there is no need for the > wrappers, and also not for this `fns` map here - you could just pass in `obj` > instead of `fns`. > > Alternatively, you could turn this into a class that has the build object as a > member. Or, I guess, you could even turn this into a mixin for MultiBuildObject. I thought you wanted to be able to use this script without mozharness. Those are not simply 'wrappers', those methods use currying to hide the fact that we need an object to call those functions during a mozharness build. When this object is not available, it uses built-in functionality to replace it. If you don't care about this script being used standalone, then I just remove the closures and use 'build_object' directly. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:182: fns["fatal"]("'dist/bin/chrome' not existent in '%s'" % obj_dir) On 2015/09/17 14:57:56, Felix Dahlke wrote: > I guess we shouldn't duplicate "dist/bin/chrome" here. You could just say: > `"'%s' does not exist" % chrome_path` Done. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:188: locales.sort() On 2015/09/17 14:57:56, Felix Dahlke wrote: > Why sort them? Why not? :-) It makes the log output clearer, as you know that 'en-US' will be before 'kk' in the output ... nothing more and nothing less. If you really think, that sorting 66 locales might be a performance bottleneck, or that we do not have the 19 bytes for this line, then I'll remove it.
https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... File mozharness/abb/abb_transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:61: def _transform_locale(locale, path, fns): On 2015/09/18 09:46:38, René Jeschke wrote: > On 2015/09/17 14:57:57, Felix Dahlke wrote: > > Wouldn't hurt readability to split this big function up into smaller ones (as > > evident from the fact that it has sections preceeded by comments), but I > > wouldn't insist. > > Then I prefer leaving it like it is. Acknowledged. https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:176: fns = {"info": _print_info(obj), On 2015/09/18 09:46:37, René Jeschke wrote: > On 2015/09/17 14:57:56, Felix Dahlke wrote: > > I think this is more complex than it would have to be: > > > > `_print_info` etc. are just wrappers that can deal with `obj` being non-None. > > Why not just check for that once, right at the beginning of this function? > > > > If we know info, error and fatal can be called, there is no need for the > > wrappers, and also not for this `fns` map here - you could just pass in `obj` > > instead of `fns`. > > > > Alternatively, you could turn this into a class that has the build object as a > > member. Or, I guess, you could even turn this into a mixin for > MultiBuildObject. > > I thought you wanted to be able to use this script without mozharness. > > Those are not simply 'wrappers', those methods use currying to hide the fact > that we need an object to call those functions during a mozharness build. When > this object is not available, it uses built-in functionality to replace it. > > If you don't care about this script being used standalone, then I just remove > the closures and use 'build_object' directly. Well, I think we should either use the build object directly, or decouple it completely by passing `fns` into `transform_locales`. Either way is fine by me, I just don't like this mixed approach :) https://codereview.adblockplus.org/29327949/diff/29328159/mozharness/abb/abb_... mozharness/abb/abb_transform_locales.py:188: locales.sort() On 2015/09/18 09:46:38, René Jeschke wrote: > On 2015/09/17 14:57:56, Felix Dahlke wrote: > > Why sort them? > > Why not? :-) > > It makes the log output clearer, as you know that 'en-US' will be before 'kk' in > the output ... nothing more and nothing less. > > If you really think, that sorting 66 locales might be a performance bottleneck, > or that we do not have the 19 bytes for this line, then I'll remove it. No, no concerns, was just wondering if there's a technical reason for it that I don't understand. Comment wouldn't hurt, but I guess it's fine anyway.
Sorry, didn't realise you had actually changed the build_object thing already. Found some more things :/ https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... File mozharness/abb/transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:51: if regex.match(engine.lower()): Sorry for not realising this earlier, but wouldn't engine.startswith(needle) do the trick here? https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:87: if white_re.match(line.lower()): I think we should do proper case insensitive matching here. The semantics differ slightly from the current version, but to me it seems that we actually want a plain old case insensitive comparison here anyway. white_re = re.compile(..., re.IGNORECASE) white_re.match(line) https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:106: short_name = line[11:-12].strip() Wow, just realised the hard coded offsets - that seems a bit fragile to me. Why not just use a group to match the text content of the <ShortName> element? https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:161: def transform_locales(build_object, obj_dir): Maybe it's just me, but I think this would be more elegant: def transform_locales(obj_dir, logger=MinimalLogger()):
https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... File mozharness/abb/transform_locales.py (right): https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:51: if regex.match(engine.lower()): On 2015/09/21 20:46:31, Felix Dahlke wrote: > Sorry for not realising this earlier, but wouldn't engine.startswith(needle) do > the trick here? Done. https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:87: if white_re.match(line.lower()): On 2015/09/21 20:46:31, Felix Dahlke wrote: > I think we should do proper case insensitive matching here. The semantics differ > slightly from the current version, but to me it seems that we actually want a > plain old case insensitive comparison here anyway. > > white_re = re.compile(..., re.IGNORECASE) > white_re.match(line) I threw out the case insensitive stuff completely, doesn't really make sense, as also Mozilla does 'real matches' IIRC. https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:106: short_name = line[11:-12].strip() On 2015/09/21 20:46:31, Felix Dahlke wrote: > Wow, just realised the hard coded offsets - that seems a bit fragile to me. Why > not just use a group to match the text content of the <ShortName> element? Done. https://codereview.adblockplus.org/29327949/diff/29328193/mozharness/abb/tran... mozharness/abb/transform_locales.py:161: def transform_locales(build_object, obj_dir): On 2015/09/21 20:46:31, Felix Dahlke wrote: > Maybe it's just me, but I think this would be more elegant: > > def transform_locales(obj_dir, logger=MinimalLogger()): Done.
LGTM! |