Delta Between Two Patch Sets: sitescripts/subscriptions/bin/updateMalwareDomainsList.py

Issue 29338412: Issue 3810 - Remove hardcoded default mirror list from Malware Domains List conversion script (Closed)
Left Patch Set: Created March 16, 2016, 11:22 a.m.
Right Patch Set: Rename mirror_list to mirrors. Created March 16, 2016, 11:56 a.m.
--- Left Patch Set
+++ Right Patch Set
 # coding: utf-8
 
 # This file is part of the Adblock Plus web scripts,
 # Copyright (C) 2006-2016 Eyeo GmbH
 #
 # Adblock Plus is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License version 3 as
 # published by the Free Software Foundation.
 #
 # Adblock Plus is distributed in the hope that it will be useful,
(...skipping 24 matching lines...)
     response = urllib2.urlopen(mirror + MALWAREDOMAINS_PATH)
     return response.read()
   except urllib2.HTTPError:
     return None
 
 
 if __name__ == '__main__':
   config = get_config()
   section = 'subscriptionDownloads'
   repository = config.get(section, 'malwaredomains_repository')
-  mirrors_list = config.get(section, 'malwaredomains_mirrors').split()
+  mirrors = config.get(section, 'malwaredomains_mirrors').split()
Sebastian Noack 2016/03/16 11:27:34 I guess we can call this variable simply "mirrors"
Vasily Kuznetsov 2016/03/16 11:56:59 And then later we'll have "for mirror in mirrors",
 
   tempdir = tempfile.mkdtemp(prefix='malwaredomains')
   try:
     subprocess.check_call(['hg', '-q', 'clone', '-U', repository, tempdir])
     subprocess.check_call(['hg', '-q', 'up', '-R', tempdir, '-r', 'default'])
 
     path = os.path.join(tempdir, 'malwaredomains_full.txt')
     file = codecs.open(path, 'wb', encoding='utf-8')
 
     print >>file, FILTERLIST_HEADER
 
-    for mirror in mirrors_list:
+    for mirror in mirrors:
       data = try_mirror(mirror)
       if data is not None:
         break
     else:
       sys.exit('Unable to fetch malware domains list.')
 
     zip = zipfile.ZipFile(StringIO(data), 'r')
     info = zip.infolist()[0]
     for line in str(zip.read(info.filename)).splitlines():
       domain = line.strip()
       if not domain:
         continue
 
       print >>file, '||%s^' % domain.decode('idna')
     file.close();
 
     if subprocess.check_output(['hg', 'stat', '-R', tempdir]) != '':
       subprocess.check_call(['hg', '-q', 'commit', '-R', tempdir, '-A', '-u', 'hgbot', '-m', 'Updated malwaredomains.com data'])
       subprocess.check_call(['hg', '-q', 'push', '-R', tempdir])
   finally:
     shutil.rmtree(tempdir, ignore_errors=True)
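
For readers unfamiliar with the for/else construct the script relies on, here is a minimal standalone sketch of the mirror fallback. It is not part of the patch. It assumes Python 3 syntax (the script under review is Python 2), raises RuntimeError where the script calls sys.exit, and uses hypothetical names: fetch_from_mirrors, parse_mirrors, fake_fetch and the example.* URLs are illustrative only.

```python
def parse_mirrors(value):
    """Split a whitespace-separated config value into a list of mirror URLs."""
    return value.split()


def fetch_from_mirrors(mirrors, fetch):
    """Return the data from the first mirror for which fetch() succeeds."""
    for mirror in mirrors:
        data = fetch(mirror)
        if data is not None:
            break
    else:
        # The else branch runs only when the loop ends without `break`,
        # i.e. every mirror failed or the list was empty.
        raise RuntimeError('Unable to fetch malware domains list.')
    return data


def fake_fetch(mirror):
    """Hypothetical stand-in for try_mirror: only one mirror answers."""
    return b'payload' if mirror == 'http://b.example/' else None


mirrors = parse_mirrors('http://a.example/\n  http://b.example/ http://c.example/')
result = fetch_from_mirrors(mirrors, fake_fetch)
```

Because the mirrors now come only from the configuration, an empty or missing value yields an empty list, and the else branch is what turns that into an explicit failure.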