Issue 29459580: Issue 5250 - Add copyright update script

rosie

June 8, 2017, 1:39 p.m. (2017-06-08 13:39:27 UTC) #1

Jon Sonesen

Hey Rosie, Thanks for the patch! It looks like the tests are not passing as ...

June 13, 2017, 4:30 p.m. (2017-06-13 16:30:20 UTC) #2

Vasily Kuznetsov

Hi Rosie, Definitely on the right track but there are quite a few small things ...

June 19, 2017, 4:25 p.m. (2017-06-19 16:25:31 UTC) #3

rosie

Thanks for the feedback! It was very helpful. I think all comments have been addressed. ...

June 23, 2017, 2:39 p.m. (2017-06-23 14:39:26 UTC) #4

Thanks for the feedback!  It was very helpful. I think all comments have been
addressed. Please let me know if I've missed anything!

https://codereview.adblockplus.org/29459580/diff/29459581/.hgignore
File .hgignore (right):

https://codereview.adblockplus.org/29459580/diff/29459581/.hgignore#newcode9
.hgignore:9: .cache
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> Also you should add .coverage to this ignore file

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/RE...
File update-copyright/README.md (right):

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/RE...
update-copyright/README.md:11: The tests can be run via
[Tox](http://tox.readthedocs.org/)
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> I think you can pass the url to the webpage to be scraped into the script so
> maybe this should be updated to reflect that?

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
File update-copyright/tests/data/tst.patch (right):

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/data/tst.patch:2: # User Rose Howell
<r.howell@adblockplus.org>
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> If you wanted to you could change the username here to a placeholder, but I am
> leaving it up to you

Ended up not needing this file at all, but good to know for next time.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/data/tst.patch:2: # User Rose Howell
<r.howell@adblockplus.org>
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> If you wanted to you could change the username here to a placeholder, but I am
> leaving it up to you

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
File update-copyright/tests/test_update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:7: import pytest
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> PEP8 suggests that the order of imports should be stdlib, third party, local.
> Pytest is third party, so it should be in a separate section by itself.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:10: from
update_copyright.update_copyright import extract_urls
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> According to PEP8, it would be ok to put all imports from update_copyright on
> the same line. Here it would also remove quite a bit of repetition, so it
seems
> better to me.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:16: local_dir =
os.path.dirname(os.path.abspath(__file__))
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> This is the path to the directory that contains the tests. If we also accept
> Jon's suggestion to use `_path` instead of `_dir` (I'm not too strong on this,
> but since below we also have urls, having `foo_bar_path` vs. `foo_bar_url`
seems
> not too bad), we could call this `tests_path`.

Good idea. I would change it, but this variable ended up not being necessary.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:16: local_dir =
os.path.dirname(os.path.abspath(__file__))
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> This is the path to the directory that contains the tests. If we also accept
> Jon's suggestion to use `_path` instead of `_dir` (I'm not too strong on this,
> but since below we also have urls, having `foo_bar_path` vs. `foo_bar_url`
seems
> not too bad), we could call this `tests_path`.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:17: data_dir =
os.path.join(local_dir, 'data')
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> Then this would be `data_path`.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:18: full_data_dir =
urllib.parse.urljoin('file://localhost', data_dir)
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> This one is a URL of the data directory, so we can call it `data_url`.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:19: hg_page =
os.path.join(full_data_dir, 'hg_page.html')
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> This one would be `hg_page_url` then.

Ended up not needing this variable either

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:19: hg_page =
os.path.join(full_data_dir, 'hg_page.html')
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> This one would be `hg_page_url` then.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:21: os.path.join(full_data_dir +
'/repo_2/')]
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> I don't think all these constants are required, such as 'local_dir'. Also, the
> url_list could probably just can just be defined in the test which uses it. I
> would also probably change the names use 'path' rather than 'dir' since in
> python a directory can be a separate object than a path, which is the string
> that leads to the dir or file. Although I dont have the strongest opinion on
> this but i was thinking it may make the code more clear.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:26: # Returns a temporary repo
containing one sample file
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> This should be a docstring. Normally docstrings describe what a
> function/method/class does from a caller perspective, and comments document
> implementation details that are relevant for someone who would change the code
> that's being commented.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:30: subprocess.check_call(['cp',
os.path.join(data_dir, 'sample_file.py'),
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> It's better (faster) to use `shutil.copy` (see
> https://docs.python.org/3.5/library/shutil.html#shutil.copy)

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:32: subprocess.check_call(['hg',
'add', os.path.join(temp_repo,
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> You can combine this with the next call of `hg` by giving it an
`-A/--addremove`
> option.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:40: def base_dir(tmpdir):
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> As we've discussed, it would be better to refactor this fixture and the one
> above to not duplicate the repository creation logic. The repository creation
> could be moved out into a function, something like `create_repo(path)` and
then
> it would be called once in `temp_repo` fixture and twice in `base_dir`
fixture.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:41: # Returns a temporary
directory that contains one html page and two
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> This should also be a docstring.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:50:
subprocess.check_call(['mkdir', repo_2])
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> you can avoid the use of temporary variables here by using the tmpdir fixture
to
> it's full potential. You can create any number of temporary directories using
> the tmpdir object, so calling subprocess to create these repo's is not needed
,
> just call 'tmpdir.mkdir(repo1)' and it will create a directory and return a
> local path object which you can also make directories in etc, also it makes
more
> sense to just call the typecasting on the object when you need it rather than
> create temporary variables that redundantly hold strings which you already
have
> access to. 

Setting up the temporary directory base_dir like this makes it cleaner later
when we call main() in test_all(). Then we just pass the path to the base
directory, and the name hg_page.html into main(), and the script can find the hg
page, both the repos to pull and modify, and also push the changes back to the
base_dir's sub-repos. I think it's also closer to running the script on real
repos if it's set up like this, and closer to being a full integration test (one
repo has push access, the other does not):

base_dir
  - hg_page.html
  - repo_1
    - .hg
    - sample_file.py
  - repo_2
    - .hg
    - sample_file.py

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:67: return str(temp_dir)
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> returning the actual path object may be preferable here since it has a join
> method which will cut down using os.path.join, also even though youll have to
> use strpath when using subprocess youo have more control over the directory
> without calling subprocess as much.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:71: assert url_list ==
extract_urls(hg_page)
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> here is where you would just hand code the url list.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:76: text_replace(temp_repo,
filename)
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> Have you tested that this is working? it seems that you are giving a path
which
> is not expected here.

In the temp_repo fixture, 'sample_file.py' is copied into the temp_repo
directory from the /data folder. This test seems to be working for me. Is it
giving you an error?

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:80: except UnicodeDecodeError:
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> I think you shouldn't catch this. The test should fail if we can't read the
file
> for some reason.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:92:
subprocess.check_call(['touch', 'temp/foo'])
On 2017/06/13 16:30:19, Jon Sonesen wrote:
> you will want to use path.join for the temp/foo here

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:100: subprocess.call(['rm',
'-r', 'temp/'])  # cleanup
On 2017/06/19 16:25:30, Vasily Kuznetsov wrote:
> It would be better to use `shutil.rmtree` instead. But there's an even better
> idea: instead of creating temporary files around the filesystem, use `tmpdir`
> fixture and create things inside of it, then you don't need to worry about
> cleanup.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:105: base_dir, 'hg_page.html')),
base_dir)
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> localhost isnt needed here, it is causing the test to fail.

Looks like I should have used 'file:///' instead of 'file://localhost'. It seems
to be working now on both Mint and Debian.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:105: base_dir, 'hg_page.html')),
base_dir)
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> localhost isnt needed here, it is causing the test to fail.

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/te...
update-copyright/tests/test_update_copyright.py:116: same =
set(file1).intersection(file2)
On 2017/06/19 16:25:29, Vasily Kuznetsov wrote:
> This seems unnecessarily complicated. Perhaps you don't really need
`tst.patch`
> and you can just check if the line 'Noissue - Updated copyright year\n' is in
> `repo_2.patch`. Would not it be the same level of checking as now?

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/to...
File update-copyright/tox.ini (right):

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/to...
update-copyright/tox.ini:2: envlist = py35
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> Can you please use py35 and 36? Similar to the tox.ini that is here: 
> https://hg.adblockplus.org/codingtools/file/tip/flake8-eyeo/tox.ini

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/to...
update-copyright/tox.ini:17: pytest --cov=update_copyright tests/
On 2017/06/13 16:30:20, Jon Sonesen wrote:
> you can add consolidate the pytest calls into one command by adding the
coverage
> command to the original pytest call
> 
> 'pytest --cov=update_copyright tests'

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/up...
File update-copyright/update_copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/up...
update-copyright/update_copyright/update_copyright.py:1: #!/usr/bin/env python3
On 2017/06/19 16:25:31, Vasily Kuznetsov wrote:
> Maybe we should move this file to `update-copyright/update_copyright.py`
instead
> and remove `update-copyright/update_copyright/__init__.py`?

Done.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/up...
update-copyright/update_copyright/update_copyright.py:20: if repo in {
On 2017/06/19 16:25:31, Vasily Kuznetsov wrote:
> This should also not be hardcoded, but it's ok for now, let's land this first
> and then deal with these special cases.

Acknowledged.

https://codereview.adblockplus.org/29459580/diff/29459581/update-copyright/up...
update-copyright/update_copyright/update_copyright.py:33:
subprocess.check_call(['hg', 'up', '--rev', 'adblockbrowser',
On 2017/06/19 16:25:31, Vasily Kuznetsov wrote:
> This should not be hardcoded, but it's ok for now.

Acknowledged.

Vasily Kuznetsov

Hi Rosie! Thanks for addressing the comments on the first patch set, it looks pretty ...

June 27, 2017, 4:18 p.m. (2017-06-27 16:18:54 UTC) #5

rosie

Thanks for the helpful feedback! Please let me know if I've missed anything. Rosie https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/.cache/v/cache/lastfailed ...

July 3, 2017, 3:33 p.m. (2017-07-03 15:33:47 UTC) #6

Thanks for the helpful feedback! Please let me know if I've missed anything.

Rosie

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/.c...
File update-copyright/.cache/v/cache/lastfailed (right):

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/.c...
update-copyright/.cache/v/cache/lastfailed:1: {}
On 2017/06/27 16:18:50, Vasily Kuznetsov wrote:
> This file should be ignored too. Probably the best approach is to ignore all
> `.cache` directories.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
File update-copyright/tests/test_update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:49: repo_1 = temp_dir +
'/repo_1'
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Any reason why you're not using `os.path.join` here? Perhaps that would be
more
> portable and consistent.

repo_1 and repo_2 were only needed as strings, so this was a bit shorter, but
you're right about switching to os.path.join being more portable and consistent.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:49: repo_1 = temp_dir +
'/repo_1'
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Any reason why you're not using `os.path.join` here? Perhaps that would be
more
> portable and consistent.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:51:
subprocess.check_call(['mkdir', repo_1])
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> You can also use `os.mkdir` for this -- simpler than calling an external
> process. Also, can this call be moved into `create_repo` function to avoid
> duplication here?

Switched it to `os.mkdir`. The `create_repo` function is used in both fixtures,
once in `temp_repo`, and twice in `base_dir`. But the `temp_repo` function
doesn't need both `repo_1` and `repo_2`, so I left that part in the fixture.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:51:
subprocess.check_call(['mkdir', repo_1])
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> You can also use `os.mkdir` for this -- simpler than calling an external
> process. Also, can this call be moved into `create_repo` function to avoid
> duplication here?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:57:
subprocess.check_call(['touch', os.path.join(repo_2, '.hg/hgrc')])
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Do we actually need to touch `hgrc` before we write into it? Wouldn't the next
> line just create the file anyway?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:66: urls =
[os.path.join(data_url + '/repo_1/'),
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Seems like we're passing only one argument to `os.path.join` here. Wouldn't it
> do nothing in this case?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/te...
update-copyright/tests/test_update_copyright.py:85: subprocess.check_call(['hg',
'clone', temp_repo.strpath, directory])
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> You could also assign `temp_repo.strpath` to a variable, like you do with
> `directory` and reduce the duplication a bit.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/to...
File update-copyright/tox.ini (right):

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/to...
update-copyright/tox.ini:2: #envlist = py{35,36}
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Forgot to uncomment? ;)

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:18: def process_repo(url, hg_upstream):
On 2017/06/27 16:18:54, Vasily Kuznetsov wrote:
> Is `hg_upstream` unused in this function? It should probably be passed to
> `hg_commit` if I follow the logic correctly.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:19: repo =
os.path.basename(os.path.normpath(url))
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> Why are we applying `os.path.normpath` to a url here? This might work
sometimes,
> but in general that function is intended for filesystem paths, not urls. I
> wonder if this won't break in a funny way some day.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:37: subprocess.call(['hg', 'up', '--rev',
'master',
On 2017/06/27 16:18:54, Vasily Kuznetsov wrote:
> Any idea why we're not doing `check_call` here just like in the other branch
of
> `if`? Seems like if `hg` fails, we should also break.

This line attempts to switch to the master branch.  If a branch called `master`
doesn't exist, then `hg up --rev master` returns a 255, but the script can keep
running whether or not there's a branch called master.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:80: if 'ssh://hg@hg.adblockplus.org/' in
hg_upstream:
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> `hg_commit` is called with the original url of the repository on line 45 of
> `process_repo` as a second parameter, so it seems like `hg_upstream` should
> already be a complete URL in it and we don't need to add `repo` to it. Or am I
> missing something?

If we're pushing to ssh://hg@hg.adblockplus.org/ servers, then we need to add
the repo name on the end. Otherwise, it will push to the URL for each of the
repos it finds on the hg_page, and we don't need to add the repo names to the
end of those.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:94: dir_path =
'https://hg.adblockplus.org/'
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> Is this used anywhere?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:112: if (self.cell == 2) and self.recordtr
is True:
On 2017/06/27 16:18:51, Vasily Kuznetsov wrote:
> The parentheses around the first operand of `and` are redundant so it would be
> more pythonic ;) to remove them.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:116: depr1 = re.search(r'\*DEPRECATED\*',
data)
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> I think it would make the code more readable and extensible if we do something
> like this instead:
> 
>     deprecated = (re.search(....) or
>                   re.search(....))
>     if not deprecated and ...:
> 

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:124: dir_path = dirname(hg_page) + '/'
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> Two points here:
> - `hg_page` is an url, could we use something other than `dirname` from
> `posixpath` (that's a bit of a misuse, I think)?
> - `dir_path` is essentially a base url, perhaps that would be a more telling
> name. What do you think?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:129: my_result = []
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> Maybe better call this `repo_urls`?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:130: for i in range(len(parser.result)):
On 2017/06/27 16:18:53, Vasily Kuznetsov wrote:
> Why not `for url in parser.result:`?

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:137: for repo in repo_list:
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> Could probably just be `for repo in extract_urls(hg_page)`.

Done.

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/up...
update-copyright/update_copyright.py:147:
default='ssh://hg@hg.adblockplus.org/',
On 2017/06/27 16:18:52, Vasily Kuznetsov wrote:
> This should probably default to `None` and then if not specified, we should
use
> the value of `--url`. Otherwise if we specify `--url` but no `--push`, the
> behavior will be rather unexpected.
> 
> It would probably also be better to name the option `--push-url` to make it
more
> clear.

Done.

Vasily Kuznetsov

Hi Rosie, It's looking good. A few more comments but nothing big that I see ...

July 3, 2017, 7:25 p.m. (2017-07-03 19:25:48 UTC) #7

rosie

https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/update_copyright.py File update-copyright/update_copyright.py (right): https://codereview.adblockplus.org/29459580/diff/29472573/update-copyright/update_copyright.py#newcode80 update-copyright/update_copyright.py:80: if 'ssh://hg@hg.adblockplus.org/' in hg_upstream: On 2017/06/27 16:18:52, Vasily Kuznetsov ...

July 4, 2017, 1:38 p.m. (2017-07-04 13:38:26 UTC) #8

Vasily Kuznetsov

Hi Rosie, LGTM from my side, but feel free to fix a couple of stylistic ...

July 4, 2017, 3:02 p.m. (2017-07-04 15:02:41 UTC) #9

rosie

https://codereview.adblockplus.org/29459580/diff/29479603/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479603/update-copyright/README.md#newcode1 update-copyright/README.md:1: # update-copyright # On 2017/07/04 15:02:40, Vasily Kuznetsov wrote: ...

July 4, 2017, 3:14 p.m. (2017-07-04 15:14:04 UTC) #10

Jon Sonesen

Hey Guys, Rose, thanks a bunch for addressing these things, and Vasily thanks a lot ...

July 4, 2017, 3:23 p.m. (2017-07-04 15:23:17 UTC) #11

Sebastian Noack

Thanks for working on this. You did a great job! For the most part, it ...

July 5, 2017, 3:46 p.m. (2017-07-05 15:46:54 UTC) #13

Thanks for working on this. You did a great job! For the most part, it looks
great, but I'm afraid that I have some additional comments. ;)

Also the title of this review (which I suppose is the same as the commit
message) seems inaccurate. While adding test is mostly what you did here, this
patch is also adding the original update_copyright.py code. Hence a better
review title (and commit message) would be something like "Issue 5250 - Added
copyright update script".

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
File update-copyright/run_update_copyright.sh (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
update-copyright/run_update_copyright.sh:1: #!/bin/sh
Is there any purpose of this wrapper script, other than providing defaults? Note
that you could also provide defaults using argsparse in update_copyright.py:

  arg_parser.add_argument('-u', '--hg-url',
                          default='https://hg.adblockplus.org/',
                          ...)
  arg_parser.add_argument('-p', '--push-url',
                          default='ssh://hg@hg.adblockplus.org/',
                          ...)

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/te...
File update-copyright/tests/data/sample_file.py (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/te...
update-copyright/tests/data/sample_file.py:13: # Copyright (C) 2006-2014 eyeo
GmbH
Perhaps we should also test for copyright information in strings (not only in
comments). This is actually a real use case which we have in adblockplusie.
Also I think we should test that non-eyeo copyright information are left alone.

There are more cases covered by the logic implemented, which we could address
with the tests, but we have the above two cases in our actual code, and
therefore it might be worth to cover them at least.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:1: #!/usr/bin/env python3
I suppose we should make this script executable (i.e. the mode above should be
755 instead of 644). Note that if the file is not executable, the shebang (i.e.
#!) here would be useless.

Once you made the script executable, you might also want to adapt the README to
instruct calling ./update_copyright.py instead of "python3 update_copyright.py",
which as a side effect removes some duplication.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:61: print("Error: Couldn't read
{}{}".format(dirpath, filename))
Failing silently (in the original code) was intended here, as this script is
expected to encounter many binary files (e.g. images). If we output an error
message every time, this will generate a huge amount of noise.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:118: re.search(r'(Deprecated)', data))
This regular expression seems incorrect. The parentheses annotate a group. So it
won't only match substrings like "(Deprecated)" but any occurrence of
"Deprecated", storing the result in the 1st group of the match (which is
discarded here). If you want to match parantheses in the input string, you have
to escape them in the pattern:

  re.search(r'\(Deprecated\)', data)

Anyway, if all you want to do is to find a fixed substring, it would be much
simpler (and less error-prone) to just use the in-operator (instead of regular
expressions):

  '*DEPRECATED*' in data or '(Deprecated)' in data

Vasily Kuznetsov

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/05 15:46:52, Sebastian Noack wrote: > Is ...

July 5, 2017, 5:38 p.m. (2017-07-05 17:38:05 UTC) #14

Sebastian Noack

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md#newcode8 update-copyright/README.md:8: If a user doesn't have permission to push to ...

July 5, 2017, 6:55 p.m. (2017-07-05 18:55:22 UTC) #15

Sebastian Noack

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/05 17:38:05, Vasily Kuznetsov wrote: > On ...

July 5, 2017, 7:08 p.m. (2017-07-05 19:08:26 UTC) #16

Vasily Kuznetsov

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/05 19:08:25, Sebastian Noack wrote: > On ...

July 5, 2017, 7:24 p.m. (2017-07-05 19:24:46 UTC) #17

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
File update-copyright/run_update_copyright.sh (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
update-copyright/run_update_copyright.sh:1: #!/bin/sh
On 2017/07/05 19:08:25, Sebastian Noack wrote:
> On 2017/07/05 17:38:05, Vasily Kuznetsov wrote:
> > On 2017/07/05 15:46:52, Sebastian Noack wrote:
> > > Is there any purpose of this wrapper script, other than providing
defaults?
> > Note
> > > that you could also provide defaults using argsparse in
update_copyright.py:
> > > 
> > >   arg_parser.add_argument('-u', '--hg-url',
> > >                           default='https://hg.adblockplus.org/',
> > >                           ...)
> > >   arg_parser.add_argument('-p', '--push-url',
> > >                           default='ssh://hg@hg.adblockplus.org/',
> > >                           ...)
> > 
> > I suggested to have a shell script instead of defaults in the python script.
> The
> > logic of dealing with the defaults in python script is somewhat unobvious
(for
> > example, think about what would happen if we run `update_copyright.py -u
> > https://hg.foo.bar/%60). Also having the script try to push stuff into our
> > production repositories when it's run without any arguments doesn't seems
like
> > the best idea.
> > 
> > The shell script separates these things, makes the script behavior simple
and
> > obvious and prevents accidental runs targeted at production.
> 
> Why do we support these arguments anyway? I guessed that these exist to be
> overridden by the tests, but apparently the tests import the script and call
> it's internal APIs directly. And given all the other eyeo-specific logic in
the
> script, it doesn't seem to be too useful to have just these URL configurable.

We support the arguments so that the script would be useful in other contexts,
because why not. The other eyeo-specific logic in the script is not great, but
we can remove it later. For now it doesn't interfere with other uses so no big
deal.

Sebastian Noack

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/05 19:24:45, Vasily Kuznetsov wrote: > On ...

July 7, 2017, 2:48 p.m. (2017-07-07 14:48:54 UTC) #18

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
File update-copyright/run_update_copyright.sh (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
update-copyright/run_update_copyright.sh:1: #!/bin/sh
On 2017/07/05 19:24:45, Vasily Kuznetsov wrote:
> On 2017/07/05 19:08:25, Sebastian Noack wrote:
> > On 2017/07/05 17:38:05, Vasily Kuznetsov wrote:
> > > On 2017/07/05 15:46:52, Sebastian Noack wrote:
> > > > Is there any purpose of this wrapper script, other than providing
> defaults?
> > > Note
> > > > that you could also provide defaults using argsparse in
> update_copyright.py:
> > > > 
> > > >   arg_parser.add_argument('-u', '--hg-url',
> > > >                           default='https://hg.adblockplus.org/',
> > > >                           ...)
> > > >   arg_parser.add_argument('-p', '--push-url',
> > > >                           default='ssh://hg@hg.adblockplus.org/',
> > > >                           ...)
> > > 
> > > I suggested to have a shell script instead of defaults in the python
script.
> > The
> > > logic of dealing with the defaults in python script is somewhat unobvious
> (for
> > > example, think about what would happen if we run `update_copyright.py -u
> > > https://hg.foo.bar/%60). Also having the script try to push stuff into our
> > > production repositories when it's run without any arguments doesn't seems
> like
> > > the best idea.
> > > 
> > > The shell script separates these things, makes the script behavior simple
> and
> > > obvious and prevents accidental runs targeted at production.
> > 
> > Why do we support these arguments anyway? I guessed that these exist to be
> > overridden by the tests, but apparently the tests import the script and call
> > it's internal APIs directly. And given all the other eyeo-specific logic in
> the
> > script, it doesn't seem to be too useful to have just these URL
configurable.
> 
> We support the arguments so that the script would be useful in other contexts,
> because why not. The other eyeo-specific logic in the script is not great, but
> we can remove it later. For now it doesn't interfere with other uses so no big
> deal.

The wrapper script makes the whole thing less self-contained and less convenient
to use. You need to have the shell script and the Python script in the same
directory, and then run the shell script from that directory (though this could
be improved). Not to mention, that while Python is cross-platform, by using
shell scripts we will require a POSIX environment.

However, I agree to your concern if we'd handle the defaults in the Python
script, that the behavior is inconsistent when -u is provided but not -p. In
order to address this, we could make -p mandatory if -u is given. But I also see
how this is slightly more complicated than using a wrapper script.

But then again, with all the other hard-coded logic, there doesn't seem to be
any use case for -u and -p in the first place. So why not just hard-coding these
URLs? This would be simple and consistent. And if we want to make this script
useful for non-eyeo environments (without having to change the code) in the
future, we can still make everything configurable in the same go, later.

Vasily Kuznetsov

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/07 14:48:53, Sebastian Noack wrote: > On ...

July 7, 2017, 3:55 p.m. (2017-07-07 15:55:27 UTC) #19

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
File update-copyright/run_update_copyright.sh (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
update-copyright/run_update_copyright.sh:1: #!/bin/sh
On 2017/07/07 14:48:53, Sebastian Noack wrote:
> On 2017/07/05 19:24:45, Vasily Kuznetsov wrote:
> > On 2017/07/05 19:08:25, Sebastian Noack wrote:
> > > On 2017/07/05 17:38:05, Vasily Kuznetsov wrote:
> > > > On 2017/07/05 15:46:52, Sebastian Noack wrote:
> > > > > Is there any purpose of this wrapper script, other than providing
> > defaults?
> > > > Note
> > > > > that you could also provide defaults using argsparse in
> > update_copyright.py:
> > > > > 
> > > > >   arg_parser.add_argument('-u', '--hg-url',
> > > > >                           default='https://hg.adblockplus.org/',
> > > > >                           ...)
> > > > >   arg_parser.add_argument('-p', '--push-url',
> > > > >                           default='ssh://hg@hg.adblockplus.org/',
> > > > >                           ...)
> > > > 
> > > > I suggested to have a shell script instead of defaults in the python
> script.
> > > The
> > > > logic of dealing with the defaults in python script is somewhat
unobvious
> > (for
> > > > example, think about what would happen if we run `update_copyright.py -u
> > > > https://hg.foo.bar/%60). Also having the script try to push stuff into
our
> > > > production repositories when it's run without any arguments doesn't
seems
> > like
> > > > the best idea.
> > > > 
> > > > The shell script separates these things, makes the script behavior
simple
> > and
> > > > obvious and prevents accidental runs targeted at production.
> > > 
> > > Why do we support these arguments anyway? I guessed that these exist to be
> > > overridden by the tests, but apparently the tests import the script and
call
> > > it's internal APIs directly. And given all the other eyeo-specific logic
in
> > the
> > > script, it doesn't seem to be too useful to have just these URL
> configurable.
> > 
> > We support the arguments so that the script would be useful in other
contexts,
> > because why not. The other eyeo-specific logic in the script is not great,
but
> > we can remove it later. For now it doesn't interfere with other uses so no
big
> > deal.
> 
> The wrapper script makes the whole thing less self-contained and less
convenient
> to use. You need to have the shell script and the Python script in the same
> directory, and then run the shell script from that directory (though this
could
> be improved). Not to mention, that while Python is cross-platform, by using
> shell scripts we will require a POSIX environment.
> 
> However, I agree to your concern if we'd handle the defaults in the Python
> script, that the behavior is inconsistent when -u is provided but not -p. In
> order to address this, we could make -p mandatory if -u is given. But I also
see
> how this is slightly more complicated than using a wrapper script.
> 
> But then again, with all the other hard-coded logic, there doesn't seem to be
> any use case for -u and -p in the first place. So why not just hard-coding
these
> URLs? This would be simple and consistent. And if we want to make this script
> useful for non-eyeo environments (without having to change the code) in the
> future, we can still make everything configurable in the same go, later.

I think we've been discussing this more than the issue deserves. While in
principle I think that it's much better to have utilities and API calls, that do
one simple thing without mixing levels of abstraction (sort of
Unix-philosophy-like approach) and we would definitely benefit from doing more
of it, especially in sitescripts repository, this particular script is probably
glue code enough that the efforts to separate it into generic and eyeo-specific
parts are misplaced.

So how about this: we remove the shell script but leave the options with eyeo
defaults and make -p mandatory if -u is provided. The options could be useful
for testing and if -p is mandatory, things should not get too confusing for the
users.

rosie

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md#newcode8 update-copyright/README.md:8: If a user doesn't have permission to push to ...

July 7, 2017, 3:55 p.m. (2017-07-07 15:55:50 UTC) #20

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/RE...
File update-copyright/README.md (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/RE...
update-copyright/README.md:8: If a user doesn't have permission to push to a
repo, the script will make a local `repo-name.patch` file to submit later.
On 2017/07/05 18:55:21, Sebastian Noack wrote:
> We always wrap code before 80 characters. While technically, this isn't code
> here, the same reasoning applies, i.e. avoiding vertical scrolling and
improving
> the reading flow when looking at the raw text.

Done.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
File update-copyright/run_update_copyright.sh (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/ru...
update-copyright/run_update_copyright.sh:1: #!/bin/sh
On 2017/07/07 14:48:53, Sebastian Noack wrote:
> On 2017/07/05 19:24:45, Vasily Kuznetsov wrote:
> > On 2017/07/05 19:08:25, Sebastian Noack wrote:
> > > On 2017/07/05 17:38:05, Vasily Kuznetsov wrote:
> > > > On 2017/07/05 15:46:52, Sebastian Noack wrote:
> > > > > Is there any purpose of this wrapper script, other than providing
> > defaults?
> > > > Note
> > > > > that you could also provide defaults using argsparse in
> > update_copyright.py:
> > > > > 
> > > > >   arg_parser.add_argument('-u', '--hg-url',
> > > > >                           default='https://hg.adblockplus.org/',
> > > > >                           ...)
> > > > >   arg_parser.add_argument('-p', '--push-url',
> > > > >                           default='ssh://hg@hg.adblockplus.org/',
> > > > >                           ...)
> > > > 
> > > > I suggested to have a shell script instead of defaults in the python
> script.
> > > The
> > > > logic of dealing with the defaults in python script is somewhat
unobvious
> > (for
> > > > example, think about what would happen if we run `update_copyright.py -u
> > > > https://hg.foo.bar/%60). Also having the script try to push stuff into
our
> > > > production repositories when it's run without any arguments doesn't
seems
> > like
> > > > the best idea.
> > > > 
> > > > The shell script separates these things, makes the script behavior
simple
> > and
> > > > obvious and prevents accidental runs targeted at production.
> > > 
> > > Why do we support these arguments anyway? I guessed that these exist to be
> > > overridden by the tests, but apparently the tests import the script and
call
> > > it's internal APIs directly. And given all the other eyeo-specific logic
in
> > the
> > > script, it doesn't seem to be too useful to have just these URL
> configurable.
> > 
> > We support the arguments so that the script would be useful in other
contexts,
> > because why not. The other eyeo-specific logic in the script is not great,
but
> > we can remove it later. For now it doesn't interfere with other uses so no
big
> > deal.
> 
> The wrapper script makes the whole thing less self-contained and less
convenient
> to use. You need to have the shell script and the Python script in the same
> directory, and then run the shell script from that directory (though this
could
> be improved). Not to mention, that while Python is cross-platform, by using
> shell scripts we will require a POSIX environment.
> 
> However, I agree to your concern if we'd handle the defaults in the Python
> script, that the behavior is inconsistent when -u is provided but not -p. In
> order to address this, we could make -p mandatory if -u is given. But I also
see
> how this is slightly more complicated than using a wrapper script.
> 
> But then again, with all the other hard-coded logic, there doesn't seem to be
> any use case for -u and -p in the first place. So why not just hard-coding
these
> URLs? This would be simple and consistent. And if we want to make this script
> useful for non-eyeo environments (without having to change the code) in the
> future, we can still make everything configurable in the same go, later.

I don't have a strong opinion either way. Seems like this will only be used once
per year, so I'd go for whichever is more straight-forward and easier to use.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/te...
File update-copyright/tests/data/sample_file.py (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/te...
update-copyright/tests/data/sample_file.py:13: # Copyright (C) 2006-2014 eyeo
GmbH
On 2017/07/05 15:46:53, Sebastian Noack wrote:
> Perhaps we should also test for copyright information in strings (not only in
> comments). This is actually a real use case which we have in adblockplusie.
> Also I think we should test that non-eyeo copyright information are left
alone.
> 
> There are more cases covered by the logic implemented, which we could address
> with the tests, but we have the above two cases in our actual code, and
> therefore it might be worth to cover them at least.

Done.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/to...
File update-copyright/tox.ini (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/to...
update-copyright/tox.ini:11: flake8-putty
On 2017/07/05 18:55:21, Sebastian Noack wrote:
> You don't seem to actually use flake8-putty here. You probably have copied
this
> over from the tox.ini in sitescripts, buildtools or cms, where we use
> flake8-putty in order to ignore specific flake8 errors for specific files, in
> order to deal with legacy code. This isn't the case here. However, you should
> add pep8-naming and flake8-eyeo instead, as we use these flake8 extensions
> everywhere. For example see this tox.ini:
> https://hg.adblockplus.org/codingtools/file/d95b8d1f3104/patchconv/tox.ini#l9

Done.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:1: #!/usr/bin/env python3
On 2017/07/05 15:46:53, Sebastian Noack wrote:
> I suppose we should make this script executable (i.e. the mode above should be
> 755 instead of 644). Note that if the file is not executable, the shebang
(i.e.
> #!) here would be useless.
> 
> Once you made the script executable, you might also want to adapt the README
to
> instruct calling ./update_copyright.py instead of "python3
update_copyright.py",
> which as a side effect removes some duplication.

Done.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:61: print("Error: Couldn't read
{}{}".format(dirpath, filename))
On 2017/07/05 15:46:53, Sebastian Noack wrote:
> Failing silently (in the original code) was intended here, as this script is
> expected to encounter many binary files (e.g. images). If we output an error
> message every time, this will generate a huge amount of noise.

Done.

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/up...
update-copyright/update_copyright.py:118: re.search(r'(Deprecated)', data))
On 2017/07/05 15:46:53, Sebastian Noack wrote:
> This regular expression seems incorrect. The parentheses annotate a group. So
it
> won't only match substrings like "(Deprecated)" but any occurrence of
> "Deprecated", storing the result in the 1st group of the match (which is
> discarded here). If you want to match parantheses in the input string, you
have
> to escape them in the pattern:
> 
>   re.search(r'\(Deprecated\)', data)
> 
> Anyway, if all you want to do is to find a fixed substring, it would be much
> simpler (and less error-prone) to just use the in-operator (instead of regular
> expressions):
> 
>   '*DEPRECATED*' in data or '(Deprecated)' in data

Done.

Sebastian Noack

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh File update-copyright/run_update_copyright.sh (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/run_update_copyright.sh#newcode1 update-copyright/run_update_copyright.sh:1: #!/bin/sh On 2017/07/07 15:55:26, Vasily Kuznetsov wrote: > So ...

July 7, 2017, 4:05 p.m. (2017-07-07 16:05:28 UTC) #21

rosie

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md#newcode8 update-copyright/README.md:8: If a user doesn't have permission to push to ...

July 7, 2017, 4:44 p.m. (2017-07-07 16:44:01 UTC) #22

Sebastian Noack

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md#newcode8 update-copyright/README.md:8: If a user doesn't have permission to push to ...

July 7, 2017, 4:58 p.m. (2017-07-07 16:58:31 UTC) #23

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/RE...
File update-copyright/README.md (right):

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/RE...
update-copyright/README.md:8: If a user doesn't have permission to push to a
repo, the script will make a local `repo-name.patch` file to submit later.
On 2017/07/07 16:44:00, rosie wrote:
> On 2017/07/07 15:55:47, rosie wrote:
> > On 2017/07/05 18:55:21, Sebastian Noack wrote:
> > > We always wrap code before 80 characters. While technically, this isn't
code
> > > here, the same reasoning applies, i.e. avoiding vertical scrolling and
> > improving
> > > the reading flow when looking at the raw text.
> > 
> > Done.
> 
> There are three places where the raw text is over 80 characters in the new
> patch. Two are links that would otherwise be less than 80 characters. The
third
> is an example command. Are these ok, or should they also be wrapped?

I guess that is fine.

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
update-copyright/update_copyright.py:117: len(self.current_url) > 2):
On 2017/07/07 16:05:27, Sebastian Noack wrote:
> The indentation is off by one character here. Example from PEP-8
> (https://www.python.org/dev/peps/pep-0008/#indentation) for hanging
indentation:
> 
>   if (this_is_one_thing and
>       that_is_another_thing):
>       # Since both conditions are true, we can frobnicate.
>       do_something()
> 
> Alternatively, you could revert to use a "deprecated" variable again, in order
> to avoid wrapping here.

It seems you missed this comment.

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
update-copyright/update_copyright.py:149: 'ssh://hg@hg.adblockplus.org/'):
Perhaps it would be worth to handle the defaults manually, in order to avoid
duplication. It also seems to make the logic easier to follow (for me at least).

  arg_parser.add_argument('-u', '--hg-url',
                          help='specify which Mercurial URL site to scrape')
  arg_parser.add_argument('-p', '--push-url',
                          help='specify where to push the repository')
  args = arg_parser.parse_args()
  if args.hg_url and not args.push_url:
      arg_parser.error('If -u is provided, -p is mandatory')
  main(args.hg_url or 'https://hg.adblockplus.org/',
       args.push_url or 'ssh://hg@hg.adblockplus.org/')

What do you think?

rosie

https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md File update-copyright/README.md (right): https://codereview.adblockplus.org/29459580/diff/29479618/update-copyright/README.md#newcode8 update-copyright/README.md:8: If a user doesn't have permission to push to ...

July 17, 2017, 10:10 a.m. (2017-07-17 10:10:07 UTC) #24

Sebastian Noack

On 2017/07/05 15:46:54, Sebastian Noack wrote: > Also the title of this review (which I ...

July 17, 2017, 10:19 a.m. (2017-07-17 10:19:55 UTC) #25

On 2017/07/05 15:46:54, Sebastian Noack wrote:
> Also the title of this review (which I suppose is the same as the commit
> message) seems inaccurate. While adding test is mostly what you did here, this
> patch is also adding the original update_copyright.py code. Hence a better
> review title (and commit message) would be something like "Issue 5250 - Added
> copyright update script".

What is about this comment?

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
update-copyright/update_copyright.py:117: len(self.current_url) > 2):
On 2017/07/07 16:58:29, Sebastian Noack wrote:
> On 2017/07/07 16:05:27, Sebastian Noack wrote:
> > The indentation is off by one character here. Example from PEP-8
> > (https://www.python.org/dev/peps/pep-0008/#indentation) for hanging
> indentation:
> > 
> >   if (this_is_one_thing and
> >       that_is_another_thing):
> >       # Since both conditions are true, we can frobnicate.
> >       do_something()
> > 
> > Alternatively, you could revert to use a "deprecated" variable again, in
order
> > to avoid wrapping here.
> 
> It seems you missed this comment.

It seems this comment still hasn't been addressed.

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
update-copyright/update_copyright.py:149: 'ssh://hg@hg.adblockplus.org/'):
On 2017/07/07 16:58:31, Sebastian Noack wrote:
> Perhaps it would be worth to handle the defaults manually, in order to avoid
> duplication. It also seems to make the logic easier to follow (for me at
least).
> 
>   arg_parser.add_argument('-u', '--hg-url',
>                           help='specify which Mercurial URL site to scrape')
>   arg_parser.add_argument('-p', '--push-url',
>                           help='specify where to push the repository')
>   args = arg_parser.parse_args()
>   if args.hg_url and not args.push_url:
>       arg_parser.error('If -u is provided, -p is mandatory')
>   main(args.hg_url or 'https://hg.adblockplus.org/',
>        args.push_url or 'ssh://hg@hg.adblockplus.org/')
> 
> What do you think?

What is about this comment?

rosie

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/update_copyright.py File update-copyright/update_copyright.py (right): https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/update_copyright.py#newcode117 update-copyright/update_copyright.py:117: len(self.current_url) > 2): On 2017/07/17 10:19:53, Sebastian Noack wrote: ...

July 17, 2017, 1:23 p.m. (2017-07-17 13:23:18 UTC) #26

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
update-copyright/update_copyright.py:117: len(self.current_url) > 2):
On 2017/07/17 10:19:53, Sebastian Noack wrote:
> On 2017/07/07 16:58:29, Sebastian Noack wrote:
> > On 2017/07/07 16:05:27, Sebastian Noack wrote:
> > > The indentation is off by one character here. Example from PEP-8
> > > (https://www.python.org/dev/peps/pep-0008/#indentation) for hanging
> > indentation:
> > > 
> > >   if (this_is_one_thing and
> > >       that_is_another_thing):
> > >       # Since both conditions are true, we can frobnicate.
> > >       do_something()
> > > 
> > > Alternatively, you could revert to use a "deprecated" variable again, in
> order
> > > to avoid wrapping here.
> > 
> > It seems you missed this comment.
> 
> It seems this comment still hasn't been addressed.

I was getting a flake8 error when using the proper indentation. Instead, I added
some extra indentation on the conditional continuation line, as recommended
here: https://github.com/PyCQA/pycodestyle/issues/386

Hope that's ok?

https://codereview.adblockplus.org/29459580/diff/29482675/update-copyright/up...
update-copyright/update_copyright.py:117: len(self.current_url) > 2):
On 2017/07/07 16:05:27, Sebastian Noack wrote:
> The indentation is off by one character here. Example from PEP-8
> (https://www.python.org/dev/peps/pep-0008/#indentation) for hanging
indentation:
> 
>   if (this_is_one_thing and
>       that_is_another_thing):
>       # Since both conditions are true, we can frobnicate.
>       do_something()
> 
> Alternatively, you could revert to use a "deprecated" variable again, in order
> to avoid wrapping here.

Done.

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
File update-copyright/update_copyright.py (right):

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
update-copyright/update_copyright.py:149: 'ssh://hg@hg.adblockplus.org/'):
On 2017/07/17 10:19:54, Sebastian Noack wrote:
> On 2017/07/07 16:58:31, Sebastian Noack wrote:
> > Perhaps it would be worth to handle the defaults manually, in order to avoid
> > duplication. It also seems to make the logic easier to follow (for me at
> least).
> > 
> >   arg_parser.add_argument('-u', '--hg-url',
> >                           help='specify which Mercurial URL site to scrape')
> >   arg_parser.add_argument('-p', '--push-url',
> >                           help='specify where to push the repository')
> >   args = arg_parser.parse_args()
> >   if args.hg_url and not args.push_url:
> >       arg_parser.error('If -u is provided, -p is mandatory')
> >   main(args.hg_url or 'https://hg.adblockplus.org/',
> >        args.push_url or 'ssh://hg@hg.adblockplus.org/')
> > 
> > What do you think?
> 
> What is about this comment?

Yeah, makes sense. I like this way better.

https://codereview.adblockplus.org/29459580/diff/29482684/update-copyright/up...
update-copyright/update_copyright.py:149: 'ssh://hg@hg.adblockplus.org/'):
On 2017/07/07 16:58:31, Sebastian Noack wrote:
> Perhaps it would be worth to handle the defaults manually, in order to avoid
> duplication. It also seems to make the logic easier to follow (for me at
least).
> 
>   arg_parser.add_argument('-u', '--hg-url',
>                           help='specify which Mercurial URL site to scrape')
>   arg_parser.add_argument('-p', '--push-url',
>                           help='specify where to push the repository')
>   args = arg_parser.parse_args()
>   if args.hg_url and not args.push_url:
>       arg_parser.error('If -u is provided, -p is mandatory')
>   main(args.hg_url or 'https://hg.adblockplus.org/',
>        args.push_url or 'ssh://hg@hg.adblockplus.org/')
> 
> What do you think?

Done.

LGTM again for PS9

Issue 29459580: Issue 5250 - Add copyright update script (Closed)

Description

Patch Set 1 #

Patch Set 2 : Addressed comments #

Patch Set 3 : Addressed more comments #

Patch Set 4 : Clean up comments, ReadMe, formatting #

Patch Set 5 : Minor formatting fixes #

Patch Set 6 : Also test strings and non-eyeo copyright info #

Patch Set 7 : Remove the wrapper #

Patch Set 8 : Remove bash script, make Python script executable #

Patch Set 9 : Fix indentation and default args #

Messages