Issue 29370597: Issue 4687 - Add Context Function get_pages_metadata to Test Site

Jon Sonesen

Jan. 2, 2017, 9:30 a.m. (2017-01-02 09:30:22 UTC) #1

Jon Sonesen

Here is my implementation of the page metadata fetcher. I found that splitting the metadata ...

Jan. 2, 2017, 12:40 p.m. (2017-01-02 12:40:49 UTC) #2

Vasily Kuznetsov

Looks good and it looks like it works. However, should not we also add a ...

Jan. 10, 2017, 7:34 p.m. (2017-01-10 19:34:25 UTC) #3

Vasily Kuznetsov

I looked at the code a bit more and played with the output, see the ...

Jan. 11, 2017, 10:52 a.m. (2017-01-11 10:52:44 UTC) #4

Jon Sonesen

Jan. 11, 2017, 5:37 p.m. (2017-01-11 17:37:04 UTC) #5

Thanks for taking the time to look closer at this. My apologies for the
oversights, which have been addressed in the second patch set. Additionally, as
per Sebastian's comments regarding where this patch should land, I have cc'd
Julian so he can get the global file patch once it gets the go-ahead to land.

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:12: path =
context['source'].page_filename(page_name, _format)
On 2017/01/11 10:52:44, Vasily Kuznetsov wrote:
> You can replace this and the following line with:
> 
> data, filename = context['source'].read_page(page_name, _format)
> 
> It basically does the same thing inside.

Done.

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:22: def parse_page_metadata(data,
page):
On 2017/01/11 10:52:44, Vasily Kuznetsov wrote:
> The return value of this function doesn't contain the `page` key that is given
> in the examples in the ticket. Also `page` argument to the function is not
> used, and it was probably intended for that.

Done.

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:23: page_metadata = {}
On 2017/01/10 19:34:25, Vasily Kuznetsov wrote:
> Seems like one level of indentation was lost here somehow.

Done.

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/pag...
File tests/test_site/pages/sitemap.tmpl (right):

https://codereview.adblockplus.org/29370597/diff/29370598/tests/test_site/pag...
tests/test_site/pages/sitemap.tmpl:14: {%- for page in
get_pages_metadata({'tags': 'popular, bar'}) -%}
On 2017/01/11 10:52:44, Vasily Kuznetsov wrote:
> I see you changed the API a little from what is described in the ticket. The
> way it is there this call would be:
> 
> get_pages_metadata({'tags': ['popular', 'bar']})
> 
> I don't mind the change, it seems that it actually simplifies both the
> implementation and the usage. You should probably update the ticket though, to
> make sure it's all in sync.

Acknowledged.

Jon Sonesen

On 2017/01/10 19:34:25, Vasily Kuznetsov wrote: > Looks good and it looks like it works. ...

Jan. 11, 2017, 6:41 p.m. (2017-01-11 18:41:27 UTC) #6

Vasily Kuznetsov

Hi Jon! Looking better now but there are a few more things to correct, see ...

Jan. 12, 2017, 10:37 a.m. (2017-01-12 10:37:47 UTC) #7

Sebastian Noack

Perhaps we should still add tests to CMS in order to guarantee the APIs used ...

Jan. 12, 2017, 11:42 a.m. (2017-01-12 11:42:11 UTC) #8

Jon Sonesen

Patch Set 2 I appear to have uploaded a bad patch. https://codereview.adblockplus.org/29370597/diff/29371576/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): ...

Jan. 19, 2017, 7:51 a.m. (2017-01-19 07:51:40 UTC) #9

Jon Sonesen

Addressed previous errors and added a default template as setting the template to empty on ...

Jan. 19, 2017, 9:14 a.m. (2017-01-19 09:14:39 UTC) #10

Vasily Kuznetsov

Hi Jon, Looks good. I noticed that we treat multiple filter values for the same ...

Jan. 20, 2017, 11:53 a.m. (2017-01-20 11:53:38 UTC) #11

Jon Sonesen

On 2017/01/20 11:53:38, Vasily Kuznetsov wrote: > Hi Jon, > > Looks good. I noticed ...

Jan. 23, 2017, 1:52 p.m. (2017-01-23 13:52:07 UTC) #12

juliandoucette

See response below. https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py#newcode40 tests/test_site/globals/get_pages_metadata.py:40: if option not in metadata[filter_name]: On ...

Jan. 23, 2017, 4:59 p.m. (2017-01-23 16:59:31 UTC) #13

juliandoucette

Something came to mind. https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py#newcode40 tests/test_site/globals/get_pages_metadata.py:40: if option not in metadata[filter_name]: ...

Jan. 24, 2017, 1:05 a.m. (2017-01-24 01:05:48 UTC) #14

juliandoucette

Corrections. I just realized that what I said before didn't make sense. https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py ...

Jan. 24, 2017, 1:10 a.m. (2017-01-24 01:10:29 UTC) #15

Vasily Kuznetsov

Thanks for the answer, Julian. I think we got a bit out of sync here, ...

Jan. 24, 2017, 9:24 a.m. (2017-01-24 09:24:46 UTC) #16

juliandoucette

See response below. https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py#newcode40 tests/test_site/globals/get_pages_metadata.py:40: if option not in metadata[filter_name]: On ...

Jan. 24, 2017, 7:32 p.m. (2017-01-24 19:32:56 UTC) #17

Vasily Kuznetsov

Jan. 25, 2017, 11:18 a.m. (2017-01-25 11:18:19 UTC) #18

Thanks for the clarification, Julian.

LGTM

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:40: if option not in
metadata[filter_name]:
On 2017/01/24 19:32:55, juliandoucette wrote:
> On 2017/01/24 09:24:46, Vasily Kuznetsov wrote:
> > On 2017/01/24 01:10:29, juliandoucette wrote:
> > > On 2017/01/23 16:59:31, juliandoucette wrote:
> > > > On 2017/01/20 11:53:38, Vasily Kuznetsov wrote:
> > > > > Here if a list of options was given for the same field name, we
> > > > > require that all of them are present. This makes sense in some cases
> > > > > but I wonder if this is what we always want. The ticket is not
> > > > > completely clear about this in my opinion.
> > > > > Perhaps Julian can tell us what would be preferred.
> > > > 
> > > > Good question. A list of required fields is a requirement for help
> > > > center. A list of optional fields or mixed required and optional fields
> > > > is not required for help center. I don't have a strong opinion about if
> > > > and how you should implement optional fields.
> > > 
> > > - Replace "field" with "value" in my response above
> > > - I don't have a strong opinion about if **or** how to implement optional
> > >   filter values**
> > > 
> > > (Sorry for the confusion)
> > 
> > Ok, I think we're not quite on the same page here so perhaps I formulated my
> > question badly. Here's an example to illustrate what I mean:
> > 
> > page1:
> > topic=philosophy
> > 
> > page2:
> > topic=philosophy,religion
> > 
> > Now if we invoke get_pages_metadata(topic='philosophy,religion'), do we want
> > to get the metadata from page1 and page2 or only from page2?
> > 
> > Implementation is easy either way, I was just wondering what would be more
> > useful for you.
> 
> I think I understood your question correctly.
> 
> Help center requires "only from page 2".

Ok, in this case I misunderstood your previous answer. But now it's clear.
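
To make the agreed behaviour concrete, here is a minimal sketch of "all requested values must be present" for a single field. The helper names `split_values` and `matches_all` are hypothetical and are not taken from the patch; the sketch assumes both the page value and the filter value are comma-separated strings, as in the example above.

```python
def split_values(raw):
    """Split a comma-separated metadata value into a set of stripped items."""
    return {item.strip() for item in raw.split(',') if item.strip()}


def matches_all(page_value, filter_value):
    """Return True only if every requested value is present on the page."""
    # Hypothetical helper: subset test, so the order of values is irrelevant.
    return split_values(filter_value) <= split_values(page_value)


pages = {
    'page1': 'philosophy',
    'page2': 'philosophy,religion',
}
selected = [name for name, topic in sorted(pages.items())
            if matches_all(topic, 'philosophy,religion')]
print(selected)  # ['page2']
```

With this reading, page1 is excluded because it lacks "religion", which is the "only from page2" result that help center needs.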

Jon Sonesen

On 2017/01/12 11:42:11, Sebastian Noack wrote: > Perhaps we should still add tests to CMS ...

Jan. 25, 2017, 12:07 p.m. (2017-01-25 12:07:25 UTC) #19

Vasily Kuznetsov

On 2017/01/25 12:07:25, Jon Sonesen wrote: > On 2017/01/12 11:42:11, Sebastian Noack wrote: > > ...

Jan. 25, 2017, 12:35 p.m. (2017-01-25 12:35:46 UTC) #20

juliandoucette

NOT LGTM https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/globals/get_pages_metadata.py#newcode28 tests/test_site/globals/get_pages_metadata.py:28: value = tuple(value.strip().split(',')) 1. It's annoying to ...

Feb. 27, 2017, 9:27 p.m. (2017-02-27 21:27:32 UTC) #21

Vasily Kuznetsov

Feb. 28, 2017, 11:24 a.m. (2017-02-28 11:24:10 UTC) #22

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:28: value =
tuple(value.strip().split(','))
On 2017/02/27 21:27:31, juliandoucette wrote:
> 1. It's annoying to have single values as arrays
>   - e.g. {{ post[title][0] }} vs {{ post[title] }}
> 2. We don't want to split all values by ","
>   - e.g. we may have commas in dates or the description of a page

Yeah, makes sense. I suppose then not every metadata field should be treated as
a list by default. Perhaps a reasonable default would be to treat them as
strings.

However, there are fields that probably should be lists, for example the tags.
We could come up with some kind of notation for those. One way to do it would be
to say that if the value starts with a square bracket ("["), then it's a list
and it should be split by comma so that:

    tags = [foo, bar]

will produce the value which is a list: ["foo", "bar"] and things like this:

    date = Wed, Nov 5

will stay a string: "Wed, Nov 5".
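
A minimal sketch of this bracket notation, written for Python 3. The function name `parse_value` is hypothetical, and requiring a closing bracket as well as an opening one is an assumption added here, not something stated in the proposal.

```python
def parse_value(raw):
    """Return a list for bracketed values, otherwise the stripped string."""
    value = raw.strip()
    # Assumption: a value counts as a list only if it is fully bracketed.
    if value.startswith('[') and value.endswith(']'):
        # Drop the brackets first, then split, then trim each item.
        items = (item.strip() for item in value[1:-1].split(','))
        return [item for item in items if item]
    return value


print(parse_value('[foo, bar]'))  # ['foo', 'bar']
print(parse_value('Wed, Nov 5'))  # Wed, Nov 5
```

Stripping the brackets before splitting and trimming each item afterwards keeps stray whitespace out of the resulting tags.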

Another way to do it is to allow something like this syntax for lists:

    tags = foo
               bar
               baz
    title = My awesome page

This would be a bit harder to parse and would require changes to CMS core to be
compatible, so it wouldn't be my first preference.

We could also adopt a different approach and treat everything as a string but
replace the exact matching with substring matching. Then if we have this in the
page header:

    tags = foo, bar

and we look for pages tagged with "foo", it would be found. What I don't like
about this approach is that a page with this in the header would also be
matched:

    tags = foobar, quux

but "foobar" seems like a different tag. We could probably fix that by requiring
that, if the match does not cover the whole string, the end of the match that
falls inside the string sits at a word boundary, a comma or something like that.
It would be more fiddly, but we can probably end up with something reasonably
fool-proof. In this case all metadata fields would be returned as strings
(nothing would be converted to a list), which might or might not be what you
want.

Do any of these solutions sound acceptable or do you have other ideas?
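
For the last option (everything stays a string, matching is restricted to boundaries), a rough sketch could look like the following. The name `has_tag` is hypothetical, and the sketch takes commas and the ends of the string as the only boundaries, which is one possible reading of "a word boundary or a comma".

```python
import re


def has_tag(field, tag):
    """Return True if `tag` occurs in `field` as a whole comma-separated item."""
    # The tag must be preceded by the start of the string or a comma, and
    # followed by a comma or the end of the string (whitespace is ignored).
    pattern = r'(?:^|,)\s*' + re.escape(tag) + r'\s*(?:,|$)'
    return re.search(pattern, field) is not None


print(has_tag('foo, bar', 'foo'))      # True
print(has_tag('foobar, quux', 'foo'))  # False
```

This avoids the "foobar" false positive, but the metadata value itself is still returned to templates as one string.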

Jon Sonesen

Feb. 28, 2017, 11:59 a.m. (2017-02-28 11:59:18 UTC) #23

Hey Julian, 

Thanks for pointing this out, I agree here it would be annoying to always be
getting [0] if there is a single value in the metadata field.

Vasily, 

Thanks for the suggestions. While I think they are viable, I was thinking it
would be easiest to always make it a list unless the resulting list has a
length of less than 2, in which case we just set it back to a string. It is not
elegant, but it is simple to understand and implement.

I guess it is really up to us as far as implementation details since the
resulting usage should be the same from Julian's perspective.

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29372661/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:28: value =
tuple(value.strip().split(','))
On 2017/02/28 11:24:09, Vasily Kuznetsov wrote:
> On 2017/02/27 21:27:31, juliandoucette wrote:
> > 1. It's annoying to have single values as arrays
> >   - e.g. {{ post[title][0] }} vs {{ post[title] }}
> > 2. We don't want to split all values by ","
> >   - e.g. we may have commas in dates or the description of a page
> 
> Yeah, makes sense. I suppose then not every metadata field should be treated
> as a list by default. Perhaps a reasonable default would be to treat them as
> strings.
> 
> However, there are fields that probably should be lists, for example the tags.
> We could come up with some kind of notation for those. One way to do it would
> be to say that if the value starts with a square bracket ("["), then it's a
> list and it should be split by comma so that:
> 
>     tags = [foo, bar]
> 
> will produce the value which is a list: ["foo", "bar"] and things like this:
> 
>     date = Wed, Nov 5
> 
> will stay a string: "Wed, Nov 5".
> 
> Another way to do it is to allow something like this syntax for lists:
> 
>     tags = foo
>                bar
>                baz
>     title = My awesome page
> 
> This would be a bit harder to parse and would require changes to CMS core to
> be compatible, so it wouldn't be my first preference.
> 
> We could also adopt a different approach and treat everything as a string but
> replace the exact matching with substring matching. Then if we have this in
> the page header:
> 
>     tags = foo, bar
> 
> and we look for pages tagged with "foo", it would be found. What I don't like
> about this approach is that a page with this in the header would also be
> matched:
> 
>     tags = foobar, quux
> 
> but "foobar" seems like a different tag. We could probably fix that by
> requiring that if not the whole string is matched, the end of the match that is
> inside of the string should be at a word boundary or a comma or something like
> that. It would be more fiddly, but we can probably end up with something
> reasonably fool-proof. In this case all metadata fields would be returned as
> strings (nothing would be converted to a list), which might or might not be
> what you want.
> 
> Do any of these solutions sound acceptable or do you have other ideas?

Could we not just check whether there is more than one value and, if not, leave
it as a string?

Vasily Kuznetsov

March 1, 2017, 4:46 p.m. (2017-03-01 16:46:30 UTC) #24

On 2017/02/28 11:59:18, Jon Sonesen wrote:
> Hey Julian, 
> 
> Thanks for pointing this out, I agree here it would be annoying to always be
> getting [0] if it there is a singular value in the metadata field. 
> 
> Vasily, 
> 
> thanks for the suggestions. While I think they are viable I was thinking it
> would be easiest to just always make it a list unless the resulting list has a
> len less than 2 in which case we just set it back to a string. It is not
> elegant but also it is simple to understand and implement.
> 
> I guess it is really up to us as far as implementation details since the
> resulting usage should be the same from Julian's perspective.
> 
> could we not just check that the value is not more than one and if so don't
> make it a list?

Well, Julian above says that some non-list strings have commas in them, so they
would be broken by this approach. Also, in some cases variables that should
be lists might have only one element in them (tags, for example). If the template
depends on those variables being lists, it might break. So it seems like we
would need something more robust.
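
Both failure modes can be reproduced with a small sketch of the length-based fallback. `naive_parse` is a hypothetical stand-in for the proposed logic, not the code from the patch.

```python
def naive_parse(raw):
    """Split on commas; fall back to a string when only one item results."""
    items = [item.strip() for item in raw.split(',')]
    return items if len(items) >= 2 else items[0]


# A date containing a comma is split even though it is not a list.
print(naive_parse('Wed, Nov 5'))  # ['Wed', 'Nov 5']
# A single tag comes back as a string, so a template iterating over tags
# would loop over the characters instead of over one tag.
print(naive_parse('foo'))  # foo
```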

juliandoucette

# On one hand String values by default and list values when string starts with ...

March 5, 2017, 9 p.m. (2017-03-05 21:00:29 UTC) #25

Jon Sonesen

On 2017/03/05 21:00:29, juliandoucette wrote: > # On one hand > > String values by ...

March 6, 2017, 10:16 a.m. (2017-03-06 10:16:18 UTC) #26

juliandoucette

> I like this idea, perhaps it could be an optional dir in the site ...

March 6, 2017, 6:22 p.m. (2017-03-06 18:22:17 UTC) #27

Jon Sonesen

On 2017/03/06 18:22:17, juliandoucette wrote: > > I like this idea, perhaps it could be ...

March 7, 2017, 10:29 a.m. (2017-03-07 10:29:38 UTC) #28

Vasily Kuznetsov

On 2017/03/07 10:29:38, Jon Sonesen wrote: > On 2017/03/06 18:22:17, juliandoucette wrote: > > > ...

March 7, 2017, 11:19 a.m. (2017-03-07 11:19:43 UTC) #29

juliandoucette

Let's go with option 2 for now, given that this is a global function and ...

March 7, 2017, 11:48 a.m. (2017-03-07 11:48:38 UTC) #30

Jon Sonesen

On 2017/03/07 11:48:38, juliandoucette wrote: > Let's go with option 2 for now, given that ...

March 7, 2017, 12:37 p.m. (2017-03-07 12:37:01 UTC) #31

juliandoucette

> Great, I will do this. One question, are tags with commas something you want? ...

March 7, 2017, 1:36 p.m. (2017-03-07 13:36:43 UTC) #32

Jon Sonesen

On 2017/03/07 13:36:43, juliandoucette wrote: > > Great, I will do this. One question, are ...

March 8, 2017, 4:17 p.m. (2017-03-08 16:17:27 UTC) #33

Vasily Kuznetsov

Hey Jon, I have a couple of nits about whitespace handling and being more careful ...

March 8, 2017, 5:07 p.m. (2017-03-08 17:07:42 UTC) #34

Jon Sonesen

Thanks for pointing out those things :) https://codereview.adblockplus.org/29370597/diff/29378682/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29378682/tests/test_site/globals/get_pages_metadata.py#newcode28 tests/test_site/globals/get_pages_metadata.py:28: value = ...

March 8, 2017, 5:47 p.m. (2017-03-08 17:47:17 UTC) #35

Vasily Kuznetsov

https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py#newcode30 tests/test_site/globals/get_pages_metadata.py:30: value = value[1:-1].strip().split(',') It should be the other way ...

March 9, 2017, 9:46 a.m. (2017-03-09 09:46:13 UTC) #36

Jon Sonesen

On 2017/03/09 09:46:13, Vasily Kuznetsov wrote: > https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py > File tests/test_site/globals/get_pages_metadata.py (right): > > https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py#newcode30 ...

March 9, 2017, 9:53 a.m. (2017-03-09 09:53:29 UTC) #37

Jon Sonesen

https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29378942/tests/test_site/globals/get_pages_metadata.py#newcode30 tests/test_site/globals/get_pages_metadata.py:30: value = value[1:-1].strip().split(',') On 2017/03/09 09:46:13, Vasily Kuznetsov wrote: ...

March 10, 2017, 9:17 a.m. (2017-03-10 09:17:26 UTC) #38

Vasily Kuznetsov

Hey Jon, I see you changed the matching algorithm in `filter_metadata` -- good catch, I ...

March 10, 2017, 10:23 a.m. (2017-03-10 10:23:38 UTC) #39

Jon Sonesen

Thanks for catching these :) https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/globals/get_pages_metadata.py File tests/test_site/globals/get_pages_metadata.py (right): https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/globals/get_pages_metadata.py#newcode41 tests/test_site/globals/get_pages_metadata.py:41: for option in filter_value.split(','): ...

March 10, 2017, 11:01 a.m. (2017-03-10 11:01:32 UTC) #40

Jon Sonesen

Just realized we will have to change the filter syntax to JSON, see comment https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/globals/get_pages_metadata.py ...

March 10, 2017, 11:43 a.m. (2017-03-10 11:43:38 UTC) #41

Jon Sonesen

Realized that later, when we add JSON, the filter syntax will have to change anyway.

March 10, 2017, 11:52 a.m. (2017-03-10 11:52:53 UTC) #42

Jon Sonesen

Realized that later, when we add JSON, the filter syntax will have to change anyway.

March 10, 2017, 11:53 a.m. (2017-03-10 11:53:55 UTC) #43

Vasily Kuznetsov

March 15, 2017, 6:02 p.m. (2017-03-15 18:02:58 UTC) #44

Hey Jon,

I like the change in the API towards accepting lists for filter values instead
of splitting them by comma. This is less ambiguous and more robust. However,
there are still a couple of corner cases where it seems the current logic would
break.

I guess we should have added some proper tests for this from the start;
otherwise it is hard to keep all the different possibilities in mind when
changing `filter_metadata`.

Sorry for being picky ;)

Cheers,
Vasily

https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:41: for option in
filter_value.split(','):
On 2017/03/10 11:43:38, Jon Sonesen wrote:
> On 2017/03/10 11:01:32, Jon Sonesen wrote:
> > On 2017/03/10 10:23:38, Vasily Kuznetsov wrote:
> > > We should probably only split if the `metadata[filter_name]` is a list.
> > > Otherwise if we have a date field and want to match by it, it won't work.
> > 
> > I guess I thought that this didnt matter since julian said they wont need
> > dates or any strings with commas inside the config fields. However I
> > understand what you mean here as once we move to json compatibility this
> > will be a concern, I will address this.
> 
> Actually in this case we will have to change the filter syntax to just accept
> json as well, this way you can filter on values which have commas currently it
> is like this filter={'tags':'popular, bar'} which is nice and simple but also
> lacks the robust-yness required to filter based on values once we add json
> support.

Yeah, this is better. Splitting the filter values was error-prone.

https://codereview.adblockplus.org/29370597/diff/29379591/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:45: return True
On 2017/03/10 11:01:32, Jon Sonesen wrote:
> On 2017/03/10 10:23:38, Vasily Kuznetsov wrote:
> > But if you return `True` here, would not this ignore the following filters?
> 
> I thought we were doing an 'or' selection so if one of the values matches the
> page is added. If this is wrong I will change it.

Yes, the docstring of the function in the issue says "A list of metadata dicts
for pages matching all the filters". BTW, perhaps it makes sense to include the
docstring into the patch here as well.
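
A minimal sketch of "matching all the filters", to show where the `return True` has to go. The function name `matches_all_filters` is hypothetical, and plain equality is used per field purely to keep the example short; the real matching rule is discussed in the comments below.

```python
def matches_all_filters(metadata, filters):
    """Return True only if every filter is satisfied by the page metadata."""
    for name, wanted in filters.items():
        if metadata.get(name) != wanted:
            # One failed filter is enough to reject the page.
            return False
    # Report a match only after every filter has been checked.
    return True


page = {'title': 'Intro', 'author': 'jon'}
print(matches_all_filters(page, {'title': 'Intro', 'author': 'vasily'}))  # False
print(matches_all_filters(page, {'title': 'Intro', 'author': 'jon'}))     # True
```

Returning `True` inside the loop would accept the first call as soon as `title` matched, and the `author` filter would never be looked at.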

https://codereview.adblockplus.org/29370597/diff/29379775/tests/test_site/glo...
File tests/test_site/globals/get_pages_metadata.py (right):

https://codereview.adblockplus.org/29370597/diff/29379775/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:42: for option in filter_value:
How about the situation when the field is a list but the filter value is a
string? For example if tags=[foo, bar] and filters={'tags': 'foo'}. I would say
the page should be included but when it iterates over 'foo' (as opposed to
['foo']), it will not find 'f' in ['foo', 'bar'] and so False will be returned.

If you agree that get_pages_metadata(filters={'tags': 'foo'}) should return a
page with tags=[foo, bar] in the header, perhaps the solution here would be to
check if `filter_value` is an instance of `basestring` and convert it to a
one-item list in this case.
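
A short sketch of that normalization, written for Python 3, where `str` takes the place of Python 2's `basestring`. The helper name `as_list` is hypothetical.

```python
def as_list(filter_value):
    """Wrap a bare string in a list so iteration yields whole values."""
    if isinstance(filter_value, str):  # `basestring` on Python 2
        return [filter_value]
    return list(filter_value)


tags = ['foo', 'bar']
# Iterating over the raw string checks 'f', 'o', 'o' one by one.
print(all(option in tags for option in 'foo'))           # False
# After normalization the whole tag is checked.
print(all(option in tags for option in as_list('foo')))  # True
```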

https://codereview.adblockplus.org/29370597/diff/29379775/tests/test_site/glo...
tests/test_site/globals/get_pages_metadata.py:45: if filter_value !=
metadata[filter_name]:
Shouldn't this be `elif` instead of `if`? It seems like executing this for
listy fields would create false negatives. For example, if we have a page that
has tags=[foo, bar] and then we do a search with the filters={'tags': ['bar',
'foo']}, this branch would trigger and return False (and I don't think it makes
sense to make tags order-dependent).
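
The difference between the two branches can be sketched as follows. `field_matches` is a hypothetical single-field helper, and it assumes the filter value for a list field is already a list.

```python
def field_matches(field_value, wanted):
    """Compare one metadata field against one filter value."""
    if isinstance(field_value, list):
        # List field: every wanted option must be present, in any order.
        for option in wanted:
            if option not in field_value:
                return False
    elif field_value != wanted:
        # Scalar field: plain equality, reached only for non-list fields.
        return False
    return True


print(field_matches(['foo', 'bar'], ['bar', 'foo']))  # True
print(field_matches('Wed, Nov 5', 'Wed, Nov 5'))      # True
```

With a plain `if` in place of the `elif`, the first call would also run the equality check, and `['foo', 'bar'] != ['bar', 'foo']` would turn the match into a false negative.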

Jon Sonesen

Hey Vasily, Thank you for catching those things, my bad for the oversight. I will ...

March 17, 2017, 7:53 a.m. (2017-03-17 07:53:34 UTC) #45

juliandoucette

On 2017/03/17 10:35:38, Vasily Kuznetsov wrote: > LGTM What are our plans around integrating this ...

March 28, 2017, 4:56 p.m. (2017-03-28 16:56:42 UTC) #47

Vasily Kuznetsov

On 2017/03/28 16:56:42, juliandoucette wrote: > On 2017/03/17 10:35:38, Vasily Kuznetsov wrote: > > LGTM ...

March 28, 2017, 6:21 p.m. (2017-03-28 18:21:59 UTC) #48

juliandoucette

> I would say if you're happy with how it works and you would like ...

April 5, 2017, 11:49 a.m. (2017-04-05 11:49:35 UTC) #49

juliandoucette

I'm probably going to use this to complete #5043 on acceptableads.com. I think this warrants ...

June 16, 2017, 7:44 p.m. (2017-06-16 19:44:14 UTC) #50

Jon Sonesen

June 23, 2017, 9:07 a.m. (2017-06-23 09:07:26 UTC) #51

Message was sent while issue was closed.

On 2017/06/16 19:44:14, juliandoucette wrote:
> I'm probably going to use this to complete #5043 on http://acceptableads.com.
> I think this warrants implementing this function in adblockplus/cms.

Sure, closing this review and opening another for the merge into a default
global.

Issue 29370597: Issue 4687 - Add Context Function get_pages_metadata to Test Site (Closed)

Patch Set 1 #

Patch Set 2 : addresses read_page usage, updates return data to include page key value #

Patch Set 3 : remove per line page assignment fix argument and data access errors #

Patch Set 4 : #

Patch Set 5 : #

Patch Set 6 : #

Patch Set 7 : realized that later when we add json the filter syntax will have to change anyway. #

Patch Set 8 : #
