|
|
Created:
Sept. 12, 2016, 11 a.m. by Jon Sonesen Modified:
Sept. 13, 2016, 11:48 a.m. CC:
saroyanm, Philip Hill Visibility:
Public. |
DescriptionIssue 4377 - Add Configurable Form to Email Service
Repository: hg.adblockplus.org/sitescripts
Patch Set 1 #
Total comments: 6
Patch Set 2 : addressed comments #Patch Set 3 : adds test and fixed logic in handler #Patch Set 4 : fixed make_error to properly build message and fix test to fail if message not built #
Total comments: 17
MessagesTotal messages: 14
Hey guys, so we are gonna want to try and get this review done ASAP so that the others can start testing. Hopefully today so maybe when Sebastian arrives we can do it together over lunch :D
https://codereview.adblockplus.org/29352643/diff/29352644/.sitescripts.example File .sitescripts.example (right): https://codereview.adblockplus.org/29352643/diff/29352644/.sitescripts.exampl... .sitescripts.example:15: sitescripts.formmail.web.formmail = We also need to add formmail2 here so the multiplexer picks it up. https://codereview.adblockplus.org/29352643/diff/29352644/.sitescripts.exampl... .sitescripts.example:165: test.fields.subject=mandatory I guess we don't need several mandatory fields, they will all behave the same anyway. Perhaps it would be more useful to have a non-mandatory email field to check that if there's no content in the field at all, it won't fail because it's not an email. So in the end we would have 1 mandatory field, 1 mandatory email, 1 non-mandatory email and one non-mandatory field. https://codereview.adblockplus.org/29352643/diff/29352644/.sitescripts.exampl... .sitescripts.example:171: Too many empty lines perhaps. https://codereview.adblockplus.org/29352643/diff/29352644/sitescripts/formmai... File sitescripts/formmail/test/test_formmail2.py (right): https://codereview.adblockplus.org/29352643/diff/29352644/sitescripts/formmai... sitescripts/formmail/test/test_formmail2.py:105: def test_non_mandatory_msg(response_for, form_data, mocker): This test seems redundant. Isn't it the same as the first lines of `test_success`? https://codereview.adblockplus.org/29352643/diff/29352644/sitescripts/formmai... File sitescripts/formmail/web/formmail2.py (right): https://codereview.adblockplus.org/29352643/diff/29352644/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:82: start_response('400 Bad Request'+str(error), response_headers) There should be spaces around that '+' (PEP8). Dunno why flake8 doesn't pick it up... https://codereview.adblockplus.org/29352643/diff/29352644/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:120: if 'mandatory' in spec: This should be `if 'email' in spec:`, etc. Also, this code duplication is not great. Perhaps we could use a function like this to abstract this error message logic: def make_error(field_name, spec, check_type, error_message): if check_type in spec: return spec[check_type].value return error_message.format(field_name)
addressed comments
adds test and fixed logic in handler
How about replacing "formmail", rather than adding another module? I suppose you did it that way for backwards compatibility. But couldn't we just keep the new implementation backwards compatible in the case there is only one context? Otherwise, the migration path will be rather inconvenient.
On 2016/09/12 14:11:50, Sebastian Noack wrote: > How about replacing "formmail", rather than adding another module? I suppose you > did it that way for backwards compatibility. But couldn't we just keep the new > implementation backwards compatible in the case there is only one context? > Otherwise, the migration path will be rather inconvenient. That's a good point. My original plan was to eventually migrate existing `formmail` installation(s) to `formmail2` (and then probably rename it to "formmail"). We're doing this in two steps because Manvel wants to have the Acceptable Ads website up and running today and I would like to not be constrained by migrating the existing `formmail` setup on www.eyeo.com. If we choose the backwards compatible approach, we don't need to migrate the existing www.eyeo.com setup but we'd still need to test that we haven't broken it -- that's extra overhead now. In any case it seems better to remove this special case (and add configuration for www.eyeo.com) in the final solution, so we still have some work for later too. As long as we agree on the overall direction of merging the two later (I'll create the ticket) it makes sense to proceed with this implementation since it's the quickest way to put Acceptable Ads website up. Does this make sense?
Yes, we should unify the setup afterwards, and remove the legacy code, either way. This can be done in one step if we extend the existing code and configuration in a backwards-compatible way. However, if we fork formmail instead, as done by this patch, besides having some code duplication for the time being, the migration path will be more complicated: 1. Migrate eyeo.com to "formmail2" 2. Replace the old implementation of "formmail" with "formmail2" and add an alias 3. Migrate eyeo.com and acceptableads.org to use "formmail" 4. Remove the alias While speaking of the time constraint, this went anything but great. Adding Philip to CC, as he project manages the new Acceptable Ads website. The sitescripts team implementing server-side features last minute, after the UI has already been implemented, for a website that have been planned for weeks, is certainly not how things should be done. That said, IMO they should just use a mailto: link for now, until we have this issue properly sorted out and implemented in the way that makes most sense. Rushing things, usually never work out. Even if we land the patch as-is, it still needs to be deployed and tested, which is already optimistic enough, given the deadline. However, I leave the decision to Vasily, as long as he takes care that the migration/cleanup mentioned above happens not much later after the release of the new Acceptable Ads website. And I really hope that last minute feature requests like that, are not becoming daily business!
Thanks, Sebastian, this is very constructive. I definitely second your comments about how this was planned and I think we should communicate better in the future to make sure we don't need to make compromises like this. In any case, it seems to me that it would be better for us to have a more functional AA website online by Wednesday even if it means a bit of additional work. Given that we're rushing this commit, making sure that we can't break eyeo.com was high on my priorities list, so I really wanted to isolate this change from the code that runs there. That why I chose the "forking" approach even though it entails more migration steps later. But apart from a bit more work we're not really creating any problems here. The code is solid, it has good test coverage and after we remove the duplication of functionality with `formmail.py` there will be nothing to worry about. If nobody objects, I would like to push this change and let Matze set up AA website using the new handler. The migration work will be tracked as #4413.
LGTM
I'm afraid that even if we do it that way, those changes aren't ready to land yet. I had a closer look at the code, see my comments. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... File sitescripts/formmail/web/formmail2.py (right): https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:20: from sitescripts.utils import get_config, sendMail, setupStderr There should be a blank line between corelib and own module imports. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:29: __slots__ = ('value',) This micro optimization doesn't do anything, since the base class, i.e. OrderedDict, requires instances to have a __dict__, anyway. The only thing __slots__ does, is skipping the creation of a per instance attributes dictionary, if possible. The attributes listed will be the only attributes that can be accessed on those instances then. Otherwise, any name can be accessed as attribute. However, if you derive from a class that requires a __dict__ for its instances, i.e. any non-builtin class that doesn't define __slots__, it doesn't have any effect to set __slots__ in the derived class. FWIW, it seems cleaner anyway, to just pass the dictionary and that value with a tuple, rather than storing the latter as an attribute of the former. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:60: setupStderr(environ['wsgi.errors']) setupStderr() is depreacted. You should actually see a deprecation warning when running that code from a terminal. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:71: request_body_length = int(environ['CONTENT_LENGTH']) If CONTENT_LENGTH, is not given but required, 411 rather than 400, should be returned. Also note that the logic here is mostly redundant with the sitescripts.web.form_handler() decorator. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:107: spec.value = set(spec.value.replace(' ', '').split(',')) It seems you want to ignore optional spaces around commas here. However, if you eliminate spaces separately before you slit the string, this will turn 'foo bar, baz' into ['foobar', 'baz'] which most likely isn't intended. It seems using a regexp is more appropriate here: re.split(r'\s*,\s*', spec.value) https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:115: email_regex = r'^\w[\w.+!-]+@\w[\w.-]+\.[a-zA-Z]{2,6}$' There is utils.encode_email_address(), which we use across sitescripts, in particular in new code, in order to validate email addresses. So we should use it here as well, rather than implementing yet another validation method. FWIW, the regexp here has some assumptions that will not necessarily always be true, yet there are other ways to construct an invalid email address that it won't catch. Correctly validating email addresses is hard till impossible, but fortunately all we need to check for is that there an @-sign, no white space, and only ASCII characters. Every mail server, client or API can safely handle addresses in that form, and will just return an error, as it might do anyway, if there is still something wrong with the address. encode_email_address() does exactly that, and also handles IDN domains.
Thanks for looking into it, Sebastian. I agree pretty much with all your comments. Since Jon is technically on vacation from today, I'm not sure if he'll address these himself, but if not, I will take over from here. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... File sitescripts/formmail/web/formmail2.py (right): https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:29: __slots__ = ('value',) On 2016/09/12 22:11:50, Sebastian Noack wrote: > This micro optimization doesn't do anything, since the base class, i.e. > OrderedDict, requires instances to have a __dict__, anyway. The only thing > __slots__ does, is skipping the creation of a per instance attributes > dictionary, if possible. The attributes listed will be the only attributes that > can be accessed on those instances then. Otherwise, any name can be accessed as > attribute. However, if you derive from a class that requires a __dict__ for its > instances, i.e. any non-builtin class that doesn't define __slots__, it doesn't > have any effect to set __slots__ in the derived class. > > FWIW, it seems cleaner anyway, to just pass the dictionary and that value with a > tuple, rather than storing the latter as an attribute of the former. Yeah, with OrderedDict __slots__ is pretty much useless. Originally this code was inheriting from dict, so it made more sense. Since we're building a tree of dictionaries here, the approach with tuples would make navigating the whole data structure more complicated, but we could just remove ConfDict class, and replace it with OrderedDict in the rest of the code. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:60: setupStderr(environ['wsgi.errors']) On 2016/09/12 22:11:49, Sebastian Noack wrote: > setupStderr() is depreacted. You should actually see a deprecation warning when > running that code from a terminal. Yeah, I saw the warning. Since we're not writing anything to `stderr` I suppose we could just delete this line. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:71: request_body_length = int(environ['CONTENT_LENGTH']) On 2016/09/12 22:11:49, Sebastian Noack wrote: > If CONTENT_LENGTH, is not given but required, 411 rather than 400, should be > returned. Also note that the logic here is mostly redundant with the > sitescripts.web.form_handler() decorator. Yes, that decorator does pretty much everything that the `post_handler` here does except for converting exceptions to error responses. I think the best solution would be to delete lines 60-78 here, add `params` as the last argument of `wrapped_handler` and then apply `form_handler` decorator to it. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:107: spec.value = set(spec.value.replace(' ', '').split(',')) On 2016/09/12 22:11:49, Sebastian Noack wrote: > It seems you want to ignore optional spaces around commas here. However, if you > eliminate spaces separately before you slit the string, this will turn 'foo bar, > baz' into ['foobar', 'baz'] which most likely isn't intended. It seems using a > regexp is more appropriate here: > > re.split(r'\s*,\s*', spec.value) Perhaps regexes are a bit overkill here. How about: spec.value = set(s.strip() for s in spec.value.split(',')) https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:115: email_regex = r'^\w[\w.+!-]+@\w[\w.-]+\.[a-zA-Z]{2,6}$' On 2016/09/12 22:11:50, Sebastian Noack wrote: > There is utils.encode_email_address(), which we use across sitescripts, in > particular in new code, in order to validate email addresses. So we should use > it here as well, rather than implementing yet another validation method. > > FWIW, the regexp here has some assumptions that will not necessarily always be > true, yet there are other ways to construct an invalid email address that it > won't catch. Correctly validating email addresses is hard till impossible, but > fortunately all we need to check for is that there an @-sign, no white space, > and only ASCII characters. Every mail server, client or API can safely handle > addresses in that form, and will just return an error, as it might do anyway, if > there is still something wrong with the address. encode_email_address() does > exactly that, and also handles IDN domains. Yeah, this regex was just copied from formmail.py. Using encode_email_address makes more sense.
I addressed the last round of comments. The review continues here: https://codereview.adblockplus.org/29352643/ https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... File sitescripts/formmail/web/formmail2.py (right): https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:20: from sitescripts.utils import get_config, sendMail, setupStderr On 2016/09/12 22:11:49, Sebastian Noack wrote: > There should be a blank line between corelib and own module imports. Done. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:29: __slots__ = ('value',) On 2016/09/13 07:15:53, Vasily Kuznetsov wrote: > On 2016/09/12 22:11:50, Sebastian Noack wrote: > > This micro optimization doesn't do anything, since the base class, i.e. > > OrderedDict, requires instances to have a __dict__, anyway. The only thing > > __slots__ does, is skipping the creation of a per instance attributes > > dictionary, if possible. The attributes listed will be the only attributes > that > > can be accessed on those instances then. Otherwise, any name can be accessed > as > > attribute. However, if you derive from a class that requires a __dict__ for > its > > instances, i.e. any non-builtin class that doesn't define __slots__, it > doesn't > > have any effect to set __slots__ in the derived class. > > > > FWIW, it seems cleaner anyway, to just pass the dictionary and that value with > a > > tuple, rather than storing the latter as an attribute of the former. > > Yeah, with OrderedDict __slots__ is pretty much useless. Originally this code > was inheriting from dict, so it made more sense. > > Since we're building a tree of dictionaries here, the approach with tuples would > make navigating the whole data structure more complicated, but we could just > remove ConfDict class, and replace it with OrderedDict in the rest of the code. Done. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:60: setupStderr(environ['wsgi.errors']) On 2016/09/13 07:15:53, Vasily Kuznetsov wrote: > On 2016/09/12 22:11:49, Sebastian Noack wrote: > > setupStderr() is depreacted. You should actually see a deprecation warning > when > > running that code from a terminal. > > Yeah, I saw the warning. Since we're not writing anything to `stderr` I suppose > we could just delete this line. Done. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:71: request_body_length = int(environ['CONTENT_LENGTH']) On 2016/09/13 07:15:52, Vasily Kuznetsov wrote: > On 2016/09/12 22:11:49, Sebastian Noack wrote: > > If CONTENT_LENGTH, is not given but required, 411 rather than 400, should be > > returned. Also note that the logic here is mostly redundant with the > > sitescripts.web.form_handler() decorator. > > Yes, that decorator does pretty much everything that the `post_handler` here > does except for converting exceptions to error responses. I think the best > solution would be to delete lines 60-78 here, add `params` as the last argument > of `wrapped_handler` and then apply `form_handler` decorator to it. In the end after removing all `form_handler` logic it didn't seem like having this decorator made sense. I got rid of it and of the BadRequestError exception. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:107: spec.value = set(spec.value.replace(' ', '').split(',')) On 2016/09/13 07:15:53, Vasily Kuznetsov wrote: > On 2016/09/12 22:11:49, Sebastian Noack wrote: > > It seems you want to ignore optional spaces around commas here. However, if > you > > eliminate spaces separately before you slit the string, this will turn 'foo > bar, > > baz' into ['foobar', 'baz'] which most likely isn't intended. It seems using a > > regexp is more appropriate here: > > > > re.split(r'\s*,\s*', spec.value) > > Perhaps regexes are a bit overkill here. How about: > > spec.value = set(s.strip() for s in spec.value.split(',')) Done. https://codereview.adblockplus.org/29352643/diff/29352714/sitescripts/formmai... sitescripts/formmail/web/formmail2.py:115: email_regex = r'^\w[\w.+!-]+@\w[\w.-]+\.[a-zA-Z]{2,6}$' On 2016/09/13 07:15:52, Vasily Kuznetsov wrote: > On 2016/09/12 22:11:50, Sebastian Noack wrote: > > There is utils.encode_email_address(), which we use across sitescripts, in > > particular in new code, in order to validate email addresses. So we should use > > it here as well, rather than implementing yet another validation method. > > > > FWIW, the regexp here has some assumptions that will not necessarily always be > > true, yet there are other ways to construct an invalid email address that it > > won't catch. Correctly validating email addresses is hard till impossible, but > > fortunately all we need to check for is that there an @-sign, no white space, > > and only ASCII characters. Every mail server, client or API can safely handle > > addresses in that form, and will just return an error, as it might do anyway, > if > > there is still something wrong with the address. encode_email_address() does > > exactly that, and also handles IDN domains. > > Yeah, this regex was just copied from formmail.py. Using encode_email_address > makes more sense. Done.
Sorry, wrong link, actually the new review is this: https://codereview.adblockplus.org/29352795/ |