modules/adblockplus/files/web/static/deploy_script.py - Issue 29777652: #6145 - Introduce deploy script for websites

Delta Between Two Patch Sets: modules/adblockplus/files/web/static/deploy_script.py

Issue 29777652: #6145 - Introduce deploy script for websites (Closed)

Left Patch Set: Created May 10, 2018, 11:21 p.m.

Right Patch Set: Use of a different name convention for the script Created July 4, 2018, 2:12 p.m.

LEFT	RIGHT
1 #!/usr/bin/env python	1 #!/usr/bin/env python

	2 #

	3 # This file is part of the Adblock Plus infrastructure

	4 # Copyright (C) 2018-present eyeo GmbH

	5 #

	6 # Adblock Plus is free software: you can redistribute it and/or modify

	7 # it under the terms of the GNU General Public License version 3 as

	8 # published by the Free Software Foundation.

	9 #

	10 # Adblock Plus is distributed in the hope that it will be useful,

	11 # but WITHOUT ANY WARRANTY; without even the implied warranty of

	12 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the

	13 # GNU General Public License for more details.

	14 #

	15 # You should have received a copy of the GNU General Public License

	16 # along with Adblock Plus. If not, see <http://www.gnu.org/licenses/>.

2	17

3 import argparse	18 import argparse

4 from contextlib import closing

5 from filecmp import dircmp	19 from filecmp import dircmp

6 import hashlib	20 import hashlib

7 import os	21 import os

8 import sys	22 import sys

9 import shutil	23 import shutil

10 import tarfile	24 import tarfile

11 import urllib2	25 import tempfile

	26 import urllib

12	27

13	28

14 def download(url):	29 __doc__ = """This script MUST be renamed in the form of $WEBSITE, e.g.

	30 help.eyeo.com, --name must be provided in order to fetch the

	31 files, expected files to be fetched are $NAME.tar.gz and $NAME.md5 in

	32 order to compare the hashes. --source must be an URL, e.g.

	33 https://helpcenter.eyeofiles.com"""

	34

	35

	36 def download(url, temporary_directory):

15 file_name = url.split('/')[-1]	37 file_name = url.split('/')[-1]

16 abs_file_name = os.path.join(os.path.dirname(os.path.realpath(__file__)),	38 absolute_file_path = os.path.join(temporary_directory, file_name)
Vasily Kuznetsov 2018/05/15 18:09:37 It seems that you are using the directory of this script as a temporary directory. Is there a particular reason to not use a real temporary directory instead? It seems unlikely that the location of this script will be a good place to stash temporary junk.
f.lopez 2018/05/21 22:23:41 Acknowledged.
17 file_name)

18 print 'Downloading: ' + file_name	39 print 'Downloading: ' + file_name

19 try:	40 urllib.urlretrieve(url, absolute_file_path)

20 with closing(urllib2.urlopen(url)) as page:	41 return absolute_file_path
Vasily Kuznetsov 2018/05/15 18:09:37 This code is more or less doing `urllib.urlretrieve` (https://docs.python.org/2/library/urllib.html#urllib.urlretrieve). Have you thought of using it?
f.lopez 2018/05/21 22:23:41 Acknowledged.
21 block_sz = 8912

22 with open(abs_file_name, 'wb') as f:

23 while True:

24 buffer = page.read(block_sz)

25 if not buffer:

26 break

27 f.write(buffer)

28 return abs_file_name

29 except urllib2.HTTPError as e:

30 if e.code == 404:

31 sys.exit("File not found on remote source")
Vasily Kuznetsov 2018/05/15 18:09:37 We have a rule for selecting the type of quotes in strings. Basically it's "use single quotes unless your string contains single quotes". See https://adblockplus.org/coding-style#python.
f.lopez 2018/05/21 22:23:41 Acknowledged.
32 except Exception as e:

33 sys.exit(e)
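
The two review suggestions for `download` (a real temporary directory and `urllib.urlretrieve`) can be sketched together. This is only an illustration, not the patch itself: it is written for Python 3, where `urlretrieve` lives in `urllib.request`, whereas the script under review targets Python 2. The function name `download_to_temp` is hypothetical.

```python
import os
from urllib.request import urlretrieve  # Python 2 equivalent: urllib.urlretrieve


def download_to_temp(url, temporary_directory):
    """Save the resource at `url` inside `temporary_directory`.

    Returns the path of the downloaded file.
    """
    # The file name is taken from the last URL segment, as in the patch.
    file_name = url.split('/')[-1]
    absolute_file_path = os.path.join(temporary_directory, file_name)
    urlretrieve(url, absolute_file_path)
    return absolute_file_path
```

A caller would typically create the directory with `tempfile.mkdtemp()` and remove it with `shutil.rmtree()` once the deployment has finished.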

34	42

35	43

36 def calculate_md5(file):	44 def calculate_md5(file):

37 with open(file) as f:	45 with open(file) as file_handle:

38 data = f.read()	46 data = file_handle.read()

39 md5_result = hashlib.md5(data).hexdigest()	47 md5_result = hashlib.md5(data).hexdigest()

40 return md5_result.strip()	48 return md5_result.strip()

41	49

42	50

43 def read_md5(file):	51 def read_md5(file):

44 with open(file) as f:	52 with open(file) as file_handle:

45 md5_result = f.readline()	53 md5_result = file_handle.readline()

46 return md5_result.strip()	54 return md5_result.strip()

47	55

48	56

49 def untar(tar_file):	57 def untar(tar_file, temporary_directory):

50 if tarfile.is_tarfile(tar_file):	58 if tarfile.is_tarfile(tar_file):

51 with tarfile.open(tar_file, 'r:gz') as tar:	59 with tarfile.open(tar_file, 'r:gz') as tar:

52 tar.extractall(os.path.dirname(os.path.realpath(tar_file)))	60 tar.extractall(temporary_directory)

53 print 'Extracted in current directory'

54 return os.path.dirname(os.path.abspath(__file__))

55	61

56	62

57 def remove_tree(to_remove):	63 def remove_tree(to_remove):

58 if os.path.exists(to_remove):	64 if os.path.exists(to_remove):

59 if os.path.isdir(to_remove):	65 if os.path.isdir(to_remove):

60 shutil.rmtree(to_remove)	66 shutil.rmtree(to_remove)

61 else:	67 else:

62 os.remove(to_remove)	68 os.remove(to_remove)

63	69

64	70

65 def clean(hash):	71 def deploy_files(directory_comparison):

66 print "cleaning directory"	72 for name in directory_comparison.diff_files:

67 cwd = os.path.dirname(os.path.abspath(__file__))	73 copytree(directory_comparison.right, directory_comparison.left)

68 [remove_tree(os.path.join(cwd, x)) for x in os.listdir(cwd)	74 for name in directory_comparison.left_only:

69 if x.startswith(hash)]	75 remove_tree(os.path.join(directory_comparison.left, name))

	76 for name in directory_comparison.right_only:

	77 copytree(directory_comparison.right, directory_comparison.left)

	78 for subdirectory_comparison in directory_comparison.subdirs.values():

	79 deploy_files(subdirectory_comparison)

70	80

71	81

72 def deploy_files(dcmp):	82 # shutil.copytree copies a tree but the destination directory MUST NOT exist

73 for name in dcmp.diff_files:	83 # this might break the site for the duration of the files being deployed

74 copytree(dcmp.right, dcmp.left)	84 # for more info read: https://docs.python.org/2/library/shutil.html

75 for name in dcmp.left_only:	85 def copytree(source, destination):

76 remove_tree(dcmp.left + "/" + name)	86 if not os.path.exists(destination):

77 for name in dcmp.right_only:	87 os.makedirs(destination)

78 copytree(dcmp.right, dcmp.left)	88 shutil.copystat(source, destination)

79 for sub_dcmp in dcmp.subdirs.values():	89 source_items = os.listdir(source)

80 deploy_files(sub_dcmp)	90 for item in source_items:

81	91 source_path = os.path.join(source, item)

82	92 destination_path = os.path.join(destination, item)

83 def copytree(src, dst):	93 if os.path.isdir(source_path):

84 if not os.path.exists(dst):	94 copytree(source_path, destination_path)

85 os.makedirs(dst)

86 shutil.copystat(src, dst)

87 lst = os.listdir(src)

88 for item in lst:

89 s = os.path.join(src, item)

90 d = os.path.join(dst, item)

91 if os.path.isdir(s):

92 copytree(s, d)

93 else:	95 else:

94 shutil.copy2(s, d)	96 shutil.copy2(source_path, destination_path)
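
The comment above `copytree` notes that `shutil.copytree` requires the destination directory not to exist, which is why the patch carries its own recursive helper. As a hedged aside: from Python 3.8 onward `shutil.copytree` accepts `dirs_exist_ok=True`, which merges into an existing tree. The sketch below assumes Python 3.8 or newer, so it does not apply to the Python 2 script under review; `merge_tree` is a hypothetical name.

```python
import shutil


def merge_tree(source, destination):
    """Copy `source` into `destination`, overwriting files that already exist.

    Assumes Python 3.8+. Files present only in `destination` are left alone.
    """
    shutil.copytree(source, destination, dirs_exist_ok=True)
```

Like the helper in the patch, this copies files one by one, so the destination is in a mixed state while the copy is running.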

95	97

96	98

97 if __name__ == '__main__':	99 if __name__ == '__main__':

	100 website = os.path.basename(__file__)

98 parser = argparse.ArgumentParser(	101 parser = argparse.ArgumentParser(

99 description='''Fetch a compressed archive in the form of $HASH.tar.gz and	102 description="""Fetch a compressed archive in the form of $NAME.tar.gz
Vasily Kuznetsov 2018/05/15 18:09:37 "Fetch" and "deploys" is inconsistent. It should be either "fetch and deploy" or "fetches and deploys". First one seems better.
Vasily Kuznetsov 2018/05/15 18:09:37 It's nicer to be consistent with the type of quotes you use for multiline strings. Normally double quotes are preferred, like in line 101.
f.lopez 2018/05/21 22:23:41 Acknowledged.
f.lopez 2018/05/21 22:23:41 Acknowledged.
100 deploys it to /var/www/$WEBSITE folder''',	103 and deploy it to /var/www/{0} folder""".format(website),

101 epilog="""--hash must be provided in order to fetch the files,	104 epilog=__doc__,

102 expected files to be fetched are $HASH.tar.gz and $HASH.md5 in order to
Vasily Kuznetsov 2018/05/15 18:09:37 This indentation is a bit hard to follow. Perhaps it would be better to add spaces in front of additional lines to make arguments clearly separated.
f.lopez 2018/05/21 22:23:42 Acknowledged.
103 compare the hashes.

104 --url and --domain are mutually exclusive, if url is provided

105 the files will be downloaded from $url/$HASH, otherwise the default

106 value will be fetched from $domain.eyeofiles.com/$HASH""",

107 )	105 )

108 parser.add_argument('--hash', action='store', type=str, nargs='?',	106 parser.add_argument('--name', action='store', type=str, required=True,
Vasily Kuznetsov 2018/05/15 18:09:37 'store' is the default action, so you don't really need to specify it. I guess it's ok if you do, but you can also just skip it.
f.lopez 2018/05/21 22:23:41 I know this is the default action, but I think it is nice to see it when reading the code.
Vasily Kuznetsov 2018/05/22 16:38:36 Acknowledged.
109 help='Hash of the commit to deploy')	107 help='Name of the tarball to deploy')

110 parser.add_argument('--url', action='store', type=str,	108 parser.add_argument('--source', action='store', type=str, required=True,

111 help='URL where files will be downloaded')	109 help='The source where files will be downloaded')
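
The right-hand patch set makes both `--name` and `--source` mandatory, replacing the optional `--url`/`--domain` pair. Below is a minimal sketch of how `required=True` behaves in `argparse`, using the option names from the patch. The `build_parser` wrapper, the description text and the example values are hypothetical and only keep the example self-contained.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description='Fetch and deploy a tarball')
    # 'store' is the default action, so spelling it out is optional.
    parser.add_argument('--name', type=str, required=True,
                        help='Name of the tarball to deploy')
    parser.add_argument('--source', type=str, required=True,
                        help='The source where files will be downloaded')
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
arguments = build_parser().parse_args(
    ['--name', 'site', '--source', 'https://example.com'])
url_file = '{0}/{1}.tar.gz'.format(arguments.source, arguments.name)
```

Omitting either option makes `parse_args` print a usage error and exit with status 2, so the URL can never be built from a missing value.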

112 parser.add_argument('--domain', action='store', type=str, nargs='?',	110 arguments = parser.parse_args()

113 help='''The domain to prepend	111 name = arguments.name

114 [eg. https://$domain.eyeofiles.com]''')	112 source = arguments.source

115 parser.add_argument('--website', action='store', type=str, nargs='?',	113 url_file = '{0}/{1}.tar.gz'.format(source, name)

116 help='The name of the website [e.g. help.eyeo.com]')	114 url_md5 = '{0}/{1}.md5'.format(source, name)

117 args = parser.parse_args()	115 temporary_directory = tempfile.mkdtemp()

118 hash = args.hash	116 try:

119 domain = args.domain	117 downloaded_file = download(url_file, temporary_directory)

120 if args.url:	118 downloaded_md5 = download(url_md5, temporary_directory)

121 url_file = '{0}/{1}.tar.gz'.format(args.url, hash)	119 if calculate_md5(downloaded_file) == read_md5(downloaded_md5):

122 url_md5 = '{0}/{1}.md5'.format(args.url, hash)	120 untar(downloaded_file, temporary_directory)

123 else:	121 tarball_directory = os.path.join(temporary_directory, name)

124 url_file = 'https://{0}.eyeofiles.com/{1}.tar.gz'.format(domain, hash)	122 destination = os.path.join('/var/www/', website)
Vasily Kuznetsov 2018/05/15 18:09:37 What if --domain is not provided?
f.lopez 2018/05/21 22:23:41 You are right. I think it's better if I only provide an argument called 'source', so that instead of having a magical conditional you always need to provide the source of the files. This way there is no misunderstanding about what's happening behind the scenes.
Vasily Kuznetsov 2018/05/22 16:38:36 Acknowledged.
125 url_md5 = 'https://{0}.eyeofiles.com/{1}.md5'.format(domain, hash)	123 directory_comparison = dircmp(destination, tarball_directory)

126 down_file = download(url_file)	124 print 'Deploying files'

127 down_md5 = download(url_md5)	125 deploy_files(directory_comparison)

128 if calculate_md5(down_file) == read_md5(down_md5):	126 else:

129 tar_directory = untar(down_file)	127 error_message = """{0}.tar.gz md5 computation doesn't match {0}.md5

130 hash_directory = os.path.join(tar_directory, hash)	128 contents""".format(name)

131 destination = '/var/www/' + args.website	129 sys.exit(error_message)

132 dcmp = dircmp(destination, hash_directory)	130 except Exception as error:

133 deploy_files(dcmp)	131 sys.exit(error)

134 clean(hash)	132 finally:
Vasily Kuznetsov 2018/05/15 18:09:37 `clean()` won't be called in cases of errors -- is this intentional?
f.lopez 2018/05/21 22:23:42 Acknowledged.
135 else:	133 shutil.rmtree(temporary_directory)

136 sys.exit("Hashes don't match")
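
The review point about `clean()` being skipped on errors is what the `try`/`finally` in the right-hand patch set addresses. A minimal sketch of that pattern follows, assuming a hypothetical `work` callable that stands in for the download, verify and deploy steps; `run_with_temporary_directory` is likewise a made-up name.

```python
import shutil
import tempfile


def run_with_temporary_directory(work):
    """Call `work(directory)` and always remove the directory afterwards."""
    temporary_directory = tempfile.mkdtemp()
    try:
        return work(temporary_directory)
    finally:
        # Runs on success, on exceptions, and on sys.exit() alike.
        shutil.rmtree(temporary_directory)
```

Because `sys.exit()` raises `SystemExit`, the `finally` clause still runs when the script bails out on a hash mismatch.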
