update-copyright/tests/test_update_copyright.py - Issue 29459580: Issue 5250 - Add copyright update script

Side by Side Diff: update-copyright/tests/test_update_copyright.py

Issue 29459580: Issue 5250 - Add copyright update script (Closed) Base URL: https://hg.adblockplus.org/codingtools

Patch Set: Created June 8, 2017, 1:39 p.m.


OLD	NEW
(Empty)
	1 #!/usr/bin/env python3

	2

	3 import os

	4 import re

	5 import datetime

	6 import subprocess

	7 import pytest
	Vasily Kuznetsov 2017/06/19 16:25:30 PEP8 suggests that the order of imports should be stdlib, third party, local. Pytest is third party, so it should be in a separate section by itself.
	rosie 2017/06/23 14:39:24 Done.
	8 import urllib.parse

	9

	10 from update_copyright.update_copyright import extract_urls
	Vasily Kuznetsov 2017/06/19 16:25:30 According to PEP8, it would be ok to put all imports from update_copyright on the same line. Here it would also remove quite a bit of repetition, so it seems better to me.
	rosie 2017/06/23 14:39:21 Done.
	11 from update_copyright.update_copyright import text_replace

	12 from update_copyright.update_copyright import hg_commit

	13 from update_copyright.update_copyright import main

	14

	15 CURRENT_YEAR = datetime.datetime.now().year

	16 local_dir = os.path.dirname(os.path.abspath(__file__))
	Vasily Kuznetsov 2017/06/19 16:25:30 This is the path to the directory that contains the tests. If we also accept Jon's suggestion to use `_path` instead of `_dir` (I'm not too strong on this, but since below we also have urls, having `foo_bar_path` vs. `foo_bar_url` seems not too bad), we could call this `tests_path`.
	rosie 2017/06/23 14:39:21 Done.
	rosie 2017/06/23 14:39:21 Good idea. I would change it, but this variable ended up not being necessary.
	17 data_dir = os.path.join(local_dir, 'data')
	Vasily Kuznetsov 2017/06/19 16:25:29 Then this would be `data_path`.
	rosie 2017/06/23 14:39:21 Done.
	18 full_data_dir = urllib.parse.urljoin('file://localhost', data_dir)
	Vasily Kuznetsov 2017/06/19 16:25:30 This one is a URL of the data directory, so we can call it `data_url`.
	rosie 2017/06/23 14:39:24 Done.
	19 hg_page = os.path.join(full_data_dir, 'hg_page.html')
	Vasily Kuznetsov 2017/06/19 16:25:29 This one would be `hg_page_url` then.
	rosie 2017/06/23 14:39:23 Ended up not needing this variable either.
	rosie 2017/06/23 14:39:23 Done.
	20 url_list = [os.path.join(full_data_dir + '/repo_1/'),

	21 os.path.join(full_data_dir + '/repo_2/')]
	Jon Sonesen 2017/06/13 16:30:20 I don't think all these constants are required, such as 'local_dir'. Also, the url_list could probably just be defined in the test which uses it. I would also probably change the names to use 'path' rather than 'dir', since in Python a directory can be a separate object from a path, which is the string that leads to the dir or file. I don't have the strongest opinion on this, but I was thinking it may make the code clearer.
	rosie 2017/06/23 14:39:22 Done.
	22

	23

	24 @pytest.fixture()

	25 def temp_repo(tmpdir):

	26 # Returns a temporary repo containing one sample file
	Vasily Kuznetsov 2017/06/19 16:25:29 This should be a docstring. Normally docstrings describe what a function/method/class does from a caller perspective, and comments document implementation details that are relevant for someone who would change the code that's being commented.
	rosie 2017/06/23 14:39:24 Done.
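A minimal sketch of the docstring/comment distinction described above. The helper `data_file_path` is hypothetical and not part of the patch: the docstring states what a caller gets back, while the comment explains an implementation choice.

```python
import os


def data_file_path(data_path, name):
    """Return the path of the test data file called `name`."""
    # os.path.join keeps this independent of the platform's path separator.
    return os.path.join(data_path, name)


# The docstring is available to callers at runtime; the comment is not.
docstring = data_file_path.__doc__
example = data_file_path('data', 'sample_file.py')
```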
	27 temp_dir = tmpdir.mkdir('tmp_dir')

	28 temp_repo = str(temp_dir)

	29 subprocess.check_call(['hg', 'init', temp_repo])

	30 subprocess.check_call(['cp', os.path.join(data_dir, 'sample_file.py'),
	Vasily Kuznetsov 2017/06/19 16:25:30 It's better (faster) to use `shutil.copy` (see https://docs.python.org/3.5/library/shutil.html#shutil.copy)
	rosie 2017/06/23 14:39:23 Done.
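A rough sketch of the `shutil.copy` suggestion, run against a scratch directory so it does not depend on the real test data. The directory layout and the file content are assumptions made for illustration; only `shutil.copy` itself comes from the comment above.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workspace:
    data_path = os.path.join(workspace, 'data')
    repo_path = os.path.join(workspace, 'repo')
    os.mkdir(data_path)
    os.mkdir(repo_path)

    # Stand-in for tests/data/sample_file.py (the content is made up).
    sample_path = os.path.join(data_path, 'sample_file.py')
    with open(sample_path, 'w') as sample:
        sample.write('# Copyright (C) 2006-2016 eyeo GmbH\n')

    # Copy in-process instead of spawning a `cp` child process.
    copied_path = shutil.copy(sample_path, repo_path)
    copied_exists = os.path.isfile(copied_path)
    copied_name = os.path.basename(copied_path)
```

When the destination is a directory, `shutil.copy` keeps the source file name and returns the path of the new file.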
	31 temp_repo])

	32 subprocess.check_call(['hg', 'add', os.path.join(temp_repo,
	Vasily Kuznetsov 2017/06/19 16:25:30 You can combine this with the next call of `hg` by giving it an `-A/--addremove` option.
	rosie 2017/06/23 14:39:22 Done.
	33 'sample_file.py'), '--repository', temp_repo])

	34 subprocess.check_call(['hg', 'commit', '-m', 'Initial commit',

	35 '--repository', temp_repo])

	36 return temp_repo

	37

	38

	39 @pytest.fixture()

	40 def base_dir(tmpdir):
	Vasily Kuznetsov 2017/06/19 16:25:29 As we've discussed, it would be better to refactor this fixture and the one above to not duplicate the repository creation logic. The repository creation could be moved out into a function, something like `create_repo(path)`, and then it would be called once in the `temp_repo` fixture and twice in the `base_dir` fixture.
	rosie 2017/06/23 14:39:23 Done.
	41 # Returns a temporary directory that contains one html page and two
	Vasily Kuznetsov 2017/06/19 16:25:30 This should also be a docstring.
	rosie 2017/06/23 14:39:24 Done.
	42 # repositories (one with push access, the other without)

	43 tmp_repo = tmpdir.mkdir('tmp_dir')

	44 temp_dir = str(tmp_repo)

	45 subprocess.check_call(['cp', os.path.join(data_dir, 'hg_page.html'),

	46 temp_dir])

	47 repo_1 = temp_dir + '/repo_1'

	48 repo_2 = temp_dir + '/repo_2'

	49 subprocess.check_call(['mkdir', repo_1])

	50 subprocess.check_call(['mkdir', repo_2])
	Jon Sonesen 2017/06/13 16:30:19 You can avoid the use of temporary variables here by using the tmpdir fixture to its full potential. You can create any number of temporary directories using the tmpdir object, so calling subprocess to create these repos is not needed. Just call 'tmpdir.mkdir(repo1)' and it will create a directory and return a local path object, which you can also make directories in, etc. It also makes more sense to call the typecasting on the object when you need it, rather than create temporary variables that redundantly hold strings you already have access to.
	rosie 2017/06/23 14:39:24 Setting up the temporary directory base_dir like this makes it cleaner later when we call main() in test_all(). Then we just pass the path to the base directory and the name hg_page.html into main(), and the script can find the hg page, both the repos to pull and modify, and also push the changes back to the base_dir's sub-repos. I think it's also closer to running the script on real repos if it's set up like this, and closer to being a full integration test (one repo has push access, the other does not):
	base_dir
	  - hg_page.html
	  - repo_1
	    - .hg
	    - sample_file.py
	  - repo_2
	    - .hg
	    - sample_file.py
	51 subprocess.check_call(['cp', os.path.join(data_dir, 'sample_file.py'),

	52 repo_1])

	53 subprocess.check_call(['cp', os.path.join(data_dir, 'sample_file.py'),

	54 repo_2])

	55 subprocess.check_call(['hg', 'init', repo_1])

	56 subprocess.check_call(['hg', 'init', repo_2])

	57 subprocess.check_call(['hg', 'commit', '-Am', '"Initial commit"',

	58 '--repository', repo_1])

	59 subprocess.check_call(['hg', 'commit', '-Am', '"Initial commit"',

	60 '--repository', repo_2])

	61

	62 # Make repo_2 read-only

	63 subprocess.check_call(['touch', os.path.join(repo_2, '.hg/hgrc')])

	64 with open(os.path.join(repo_2, '.hg/hgrc'), 'w') as hgrc:

	65 hook = '[hooks]\npretxnchangegroup = return True'

	66 hgrc.write(hook)

	67 return str(temp_dir)
	Jon Sonesen 2017/06/13 16:30:20 Returning the actual path object may be preferable here, since it has a join method which will cut down on the use of os.path.join. Also, even though you'll have to use strpath when using subprocess, you have more control over the directory without calling subprocess as much.
	rosie 2017/06/23 14:39:22 Done.
	68

	69

	70 def test_extract_urls():

	71 assert url_list == extract_urls(hg_page)
	Jon Sonesen 2017/06/13 16:30:19 Here is where you would just hand-code the URL list.
	rosie 2017/06/23 14:39:22 Done.
	72

	73

	74 def test_text_replacement(temp_repo):

	75 filename = os.path.join(temp_repo, 'sample_file.py')

	76 text_replace(temp_repo, filename)
	Jon Sonesen 2017/06/13 16:30:19 Have you tested that this is working? It seems that you are giving a path which is not expected here.
	rosie 2017/06/23 14:39:23 In the temp_repo fixture, 'sample_file.py' is copied into the temp_repo directory from the /data folder. This test seems to be working for me. Is it giving you an error?
	77 with open(filename) as file:

	78 try:

	79 text = file.read()

	80 except UnicodeDecodeError:
	Vasily Kuznetsov 2017/06/19 16:25:30 I think you shouldn't catch this. The test should fail if we can't read the file for some reason.
	rosie 2017/06/23 14:39:22 Done.
	81 print("Error: Couldn't read {}{}".format(temp_repo, filename))

	82 return

	83 pattern = re.compile(r'(copyright.*?\d{4})(?:-\d{4})?\s+eyeo gmbh',

	84 re.I)

	85 for year in re.finditer(pattern, text):

	86 dates = re.search(r'(\d{4})-(\d{4})', year.group(0))

	87 assert dates.group(2) == str(CURRENT_YEAR)

	88

	89

	90 def test_hg_commit(temp_repo):

	91 subprocess.check_call(['hg', 'clone', temp_repo, 'temp'])

	92 subprocess.check_call(['touch', 'temp/foo'])
	Jon Sonesen 2017/06/13 16:30:19 You will want to use path.join for the temp/foo here.
	Vasily Kuznetsov 2017/06/19 16:25:30 And it's probably better to use something like `open(path, 'w').close()` to create the file instead of invoking `touch`.
	rosie 2017/06/23 14:39:23 Done.
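The two suggestions above can be sketched together as follows. The scratch directory stands in for the cloned repository and is an assumption made so the snippet runs on its own; the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as clone_path:
    # Build the path portably instead of hard-coding 'temp/foo'.
    foo_path = os.path.join(clone_path, 'foo')

    # Opening for writing creates the file; closing it leaves it empty.
    open(foo_path, 'w').close()

    foo_exists = os.path.isfile(foo_path)
    foo_size = os.path.getsize(foo_path)
```

One difference from `touch`: opening with mode `'w'` truncates a file that already exists, which does not matter here because the file is new.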
	93 subprocess.check_call(['hg', 'add', '--repository', 'temp'])

	94 hg_commit('temp', temp_repo)

	95

	96 # Make sure both files contain the commit message from hg log

	97 log_1 = subprocess.run(['hg', 'log', '--repository', temp_repo],

	98 stdout=subprocess.PIPE)

	99 assert 'Noissue - Updated copyright year' in str(log_1.stdout)

	100 subprocess.call(['rm', '-r', 'temp/']) # cleanup
	Vasily Kuznetsov 2017/06/19 16:25:30 It would be better to use `shutil.rmtree` instead. But there's an even better idea: instead of creating temporary files around the filesystem, use the `tmpdir` fixture and create things inside of it, then you don't need to worry about cleanup.
	rosie 2017/06/23 14:39:22 Done.
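To illustrate the idea of letting a managed temporary directory handle cleanup, here is a small sketch that uses the standard library's `tempfile.TemporaryDirectory` as a stand-in for pytest's `tmpdir`, so that it runs without pytest. It is an analogy rather than the fixture itself, and the names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch_path:
    # Everything the test creates lives under the managed directory.
    clone_path = os.path.join(scratch_path, 'temp')
    os.mkdir(clone_path)
    open(os.path.join(clone_path, 'foo'), 'w').close()
    existed_inside = os.path.isdir(clone_path)

# Leaving the `with` block removes the directory and its contents,
# so no explicit `rm -r` or shutil.rmtree call is needed.
exists_after = os.path.exists(scratch_path)
```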
	101

	102

	103 def test_all(base_dir):

	104 main(urllib.parse.urljoin('file://localhost', os.path.join(

	105 base_dir, 'hg_page.html')), base_dir)
	Jon Sonesen 2017/06/13 16:30:20 localhost isn't needed here; it is causing the test to fail.
	rosie 2017/06/23 14:39:21 Looks like I should have used 'file:///' instead of 'file://localhost'. It seems to be working now on both Mint and Debian.
	rosie 2017/06/23 14:39:24 Done.
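As a side note, one way to get a `file:///` URL without assembling the prefix by hand is `pathlib.Path.as_uri()`. This is an alternative sketch, not what the patch does; the scratch directory and placeholder page content are assumptions.

```python
import pathlib
import tempfile
import urllib.parse

with tempfile.TemporaryDirectory() as base_path:
    page_path = pathlib.Path(base_path).resolve() / 'hg_page.html'
    page_path.write_text('<html></html>')  # placeholder content

    # as_uri() requires an absolute path and yields a file:/// URL
    # with an empty host part.
    page_url = page_path.as_uri()
    parsed = urllib.parse.urlparse(page_url)
```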
	106

	107 # assert hg log for repo_1

	108 log_1 = subprocess.run(['hg', 'log', '--repository',

	109 os.path.join(base_dir, 'repo_1')],

	110 stdout=subprocess.PIPE)

	111 assert 'Noissue - Updated copyright year' in str(log_1.stdout)

	112

	113 # assert the .patch file for repo_2

	114 with open(os.path.join(data_dir, 'tst.patch'), 'r') as file1:

	115 with open('repo_2.patch') as file2:

	116 same = set(file1).intersection(file2)
	Vasily Kuznetsov 2017/06/19 16:25:29 This seems unnecessarily complicated. Perhaps you don't really need `tst.patch` and you can just check if the line 'Noissue - Updated copyright year\n' is in `repo_2.patch`. Wouldn't that be the same level of checking as now?
	rosie 2017/06/23 14:39:21 Done.
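A sketch of the simpler check suggested above: read the patch file's lines and test for the commit message directly, with no reference file. The patch text here is made up for illustration; a real `repo_2.patch` will contain different headers and a real diff.

```python
import os
import tempfile

# Made-up patch text for illustration only.
sample_patch = (
    '# HG changeset patch\n'
    'Noissue - Updated copyright year\n'
    '\n'
    'diff --git a/sample_file.py b/sample_file.py\n'
)

with tempfile.TemporaryDirectory() as scratch_path:
    patch_path = os.path.join(scratch_path, 'repo_2.patch')
    with open(patch_path, 'w') as patch_file:
        patch_file.write(sample_patch)

    # readlines() keeps each trailing newline, so this matches the
    # commit message only when it is a whole line of the patch.
    with open(patch_path) as patch_file:
        patch_lines = patch_file.readlines()
    message_found = 'Noissue - Updated copyright year\n' in patch_lines
```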
	117 assert 'Noissue - Updated copyright year\n' in same

	118 subprocess.call(['rm', 'repo_2.patch'])
