User talk:Citation bot/Archive 8


Updated release now published

With apologies for the delay, I've now finished wrangling with various new credential protocols and have pulled the latest version of the bot -- with many long-anticipated bug fixes -- onto the production site. Hopefully this will work for all with no glitches, but being realistic, please do raise any issues either here as usual, or (if the issue relates to the implementation, i.e. the service being unavailable) try raising a GitHub issue, which may catch my attention more promptly. Please do let me know how you all get on! In particular, if a reported bug is now fixed, please do mark it as such by setting its status to {{fixed}}. Martin (Smith609 – Talk) 07:09, 23 July 2018 (UTC)

@Smith609: Many thanks! Headbomb {t · c · p · b} 12:45, 24 July 2018 (UTC)

{{notabug}} we have already moved on. Flag for archiving. AManWithNoPlan (talk) 17:28, 24 July 2018 (UTC)

Bot replaced translator-first with unrecognized and incorrect parameter

Status
new bug
Reported by
(t) Josve05a (c) 20:26, 23 July 2018 (UTC)
Type of bug
Deleterious
What happens
The bot replaced |translator-first= and |translator-last= with |inventor-first= and |inventor-last=, which isn't recognized by {{cite book}}, nor is correct in this situation.
What should happen
The bot should not replace human added |translator-first= and |translator-last= with other parameters
Relevant diffs/links
Special:Diff/851668101&oldid=851545607
Replication instructions
Run bot on England
We can't proceed until
Bot operator's feedback on what is feasible


This is because the citation template people add parameters like candy. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 23:08, 23 July 2018 (UTC)
Should probably double check Module:Citation/CS1/Whitelist/sandbox against the list in the code. --Izno (talk) 00:50, 24 July 2018 (UTC)
Added a bunch more. AManWithNoPlan (talk) 03:37, 24 July 2018 (UTC)

{{fixed}}

vauthors replaced with deprecated authors

Status
new bug
Reported by
Boghog (talk) 06:05, 24 July 2018 (UTC)
Type of bug
Deleterious
What happens
|vauthors= is replaced with |authors=
What should happen
Should not touch |vauthors=
Relevant diffs/links
diff
Replication instructions
Run bot on Antioxidant
We can't proceed until
Bot operator's feedback on what is feasible


The long supported |vauthors= produces clean metadata while the deprecated |authors= does not. Boghog (talk) 06:05, 24 July 2018 (UTC)

I note a recent discussion where this behavior was mentioned with a question about whether this is the desired behavior. Boghog (talk) 06:29, 24 July 2018 (UTC)

I think this will fix it: https://github.com/ms609/citation-bot/pull/428. AManWithNoPlan (talk) 12:32, 24 July 2018 (UTC)

{{fixed}}

Bot should not replace access-date and dead-url

Status
{{fixed}}
Reported by
(t) Josve05a (c) 07:18, 24 July 2018 (UTC)
Type of bug
Cosmetic
What happens
The bot replaces |access-date= and |dead-url= with |accessdate= and |deadurl=. Both are accepted; however, access-date and dead-url are preferred per the template documentation.
What should happen
The bot should not replace parameters with other parameters with/without a hyphen.
Relevant diffs/links
Special:Diff/851731112&oldid=828984182
Replication instructions
Run the bot on a page with |access-date= in {{cite web}}
We can't proceed until
Bot operator's feedback on what is feasible


The templates have added so many things. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:41, 24 July 2018 (UTC)

The RFC on hyphenated parameter names was four years ago. – Jonesey95 (talk) 15:44, 24 July 2018 (UTC)


hdl-access

Status
{{fixed}}
Reported by
(t) Josve05a (c) 10:54, 24 July 2018 (UTC)
Type of bug
Deleterious
What happens
The bot replaces |doi-access= with |hdl-access= for no reason
What should happen
Do not replace acceptable parameters that have content without a guarantee that doing so does not cause an error
Relevant diffs/links
Special:Diff/851749497&oldid=851648489
Replication instructions
Run the bot on Reptile
We can't proceed until
Bot operator's feedback on what is feasible


See also this edit where the bot replaced |url-access= with |hdl-access=.

Trappist the monk (talk) 11:18, 24 July 2018 (UTC)


Added to white list https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:39, 24 July 2018 (UTC)

Support for new parameters

User:AManWithNoPlan has kindly added new parameters to the bot's dictionary. I've pulled through this update now, so hopefully replacement of unrecognized parameters will no longer be an issue. Martin (Smith609 – Talk) 16:29, 24 July 2018 (UTC)

{{fixed}} flagged for archiving.

Redundant europepmc.org URLs added

Status
new bug
Reported by
Boghog (talk) 06:05, 24 July 2018 (UTC)
Type of bug
Inconvenience
What happens
Redundant europepmc.org URLs are added to templates containing |pmc=
What should happen
should not add redundant URLs
Relevant diffs/links
diff
Replication instructions
Run bot on Antioxidant
We can't proceed until
Bot operator's feedback on what is feasible


Europe PubMed Central is a mirror of PubMed Central. |pmc= links the title of the article to the relevant page on PubMed Central. Adding the redundant |url= replaces the already linked title with a link to a mirror site. Boghog (talk) 06:21, 24 July 2018 (UTC)

{{fixed}} https://github.com/ms609/citation-bot/pull/430 AManWithNoPlan (talk) 15:24, 24 July 2018 (UTC)

Should recognize HDL

Also in that same edit, handle system has its own cs1|2 parameter: |hdl=; instead of:
|url=http://hdl.handle.net/10397/34754
write:
|hdl=10397/34754

diff


https://github.com/ms609/citation-bot/pull/433 AManWithNoPlan (talk) 17:26, 24 July 2018 (UTC)


{{fixed}}
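
For illustration, the |url= to |hdl= conversion requested above could be sketched as follows. This is a minimal sketch, not the bot's actual code (which lives in the linked pull request): the function name `url_to_hdl`, the regular expression, and the dict-of-parameters representation are all assumptions made for the example.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern for handle.net resolver URLs such as
# http://hdl.handle.net/10397/34754 (query strings and fragments are not matched).
HDL_URL = re.compile(r"^https?://hdl\.handle\.net/(?P<hdl>[^?#\s]+)$", re.IGNORECASE)


def url_to_hdl(params):
    """Return a copy of params with a handle.net |url= moved to |hdl=.

    params is a plain dict of template parameter names to values
    (an assumed representation, chosen only for this sketch).
    """
    result = dict(params)
    match = HDL_URL.match(result.get("url", "").strip())
    # Only convert when no |hdl= is already set, so nothing is overwritten.
    if match and not result.get("hdl"):
        result["hdl"] = unquote(match.group("hdl"))
        del result["url"]
    return result
```

Any URL that does not match the handle.net pattern is left where it is, which keeps the change conservative.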

Bot adds arxiv urls rather than use the arxiv parameter

Status
new bug
Reported by
Headbomb {t · c · p · b} 13:10, 24 July 2018 (UTC)
Type of bug
Inconvenience
What happens
[1]
What should happen
No arxiv urls added. Use |arxiv= for this.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=LOBPCG&diff=prev&oldid=851764808
Replication instructions
Run on LOBPCG
We can't proceed until
Agreement on the best solution
Requested action from maintainer
Fix this.


https://github.com/ms609/citation-bot/pull/430 AManWithNoPlan (talk) 17:00, 24 July 2018 (UTC)


{{fixed}}

bot replaced |doi-access= with |hdl-access=

Status
{{Fixed}}
Reported by
Trappist the monk (talk) 10:49, 24 July 2018 (UTC)
Relevant diffs/links
this edit
We can't proceed until
Agreement on the best solution


Added to white list. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:36, 24 July 2018 (UTC)

Improperly adds journal to citation template with contribution/title/series parameters

Status
new bug
Reported by
David Eppstein (talk) 05:11, 21 June 2018 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
What happens
Some academic journals are also simultaneously book series. When a citation is made to a book in such a series using the citation template with the contribution/title/series parameters (for the title of the paper, title of the book, and title of the series) it is incorrect to add a duplicate journal parameter with the same value as the series. This creates a faulty citation, because the citation template does not allow both contribution and title in citations with nonempty journal parameters, and also because the series parameter means something different in citations with a journal. In the linked case, the citation was already correct as it stands. It would also work to use title/department/journal instead of contribution/title/series, but the bot's choice of contribution/title/journal is just broken.
Relevant diffs/links
Special:Diff/846802069
We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/435 AManWithNoPlan (talk) 20:48, 24 July 2018 (UTC)

{{fixed}}

Google Books

Linking a title to Googlebooks is ok when the link leads to a preview; when it doesn't, as in this:
|url=https://books.google.com/books?id=IY-YtwEACAAJ
it is better to omit |url= so that user expectation (that the citation title links to a source that can be read) is not confounded; users can get to Googlebooks through |isbn=978-3-527-30673-2 and its link through Special:BookSources.
Trappist the monk (talk) 10:12, 24 July 2018 (UTC)

diff

Is there a cross-Wikipedia consensus on this? I can see editors becoming upset if links that they have added are removed by an automatic process. Martin (Smith609 – Talk) 16:35, 24 July 2018 (UTC)
There is no consensus to remove urls to google books information page. However, the bot should not add the links to all cite books without a url either. (t) Josve05a (c) 16:40, 24 July 2018 (UTC)


Why does the bot suddenly add links to google books out of nowhere? That should not be done. Headbomb {t · c · p · b} 11:42, 24 July 2018 (UTC)
I have created a pull request. https://github.com/ms609/citation-bot/pull/431 Probably a good idea until we almost all agree and until we verify that the hundred other link types do not exist AManWithNoPlan (talk) 17:06, 24 July 2018 (UTC)
I'd consider this one an urgent, fix-now thing, so if that could be deployed, that would be great. Everything else is relatively minor, but Quark, for instance, required massive cleanup because of this [2]. Headbomb {t · c · p · b} 18:59, 24 July 2018 (UTC)

Adding |url= when the cs1|2 template has |title-link= will produce the same undesirable results. I have not seen this, but when fixing this bug, you might check to make sure that the bot does not add |url= when |title-link= is set.

Trappist the monk (talk) 10:21, 24 July 2018 (UTC)

I assume Smith is sleeping right now. I know his and my time zone are not the same! AManWithNoPlan (talk) 23:30, 24 July 2018 (UTC)
Deployed. Martin (Smith609 – Talk) 06:10, 25 July 2018 (UTC)
Either the deploy failed or the issue is not resolved correctly. This bot edit, three hours after the above deployment notice, adds superfluous google books links; one of which broke an existing citation template.
Trappist the monk (talk) 09:58, 25 July 2018 (UTC)
And another.
Trappist the monk (talk) 12:02, 25 July 2018 (UTC)


I will look at it again. It worked for my test cases but not these. Half fixed but still broken. AManWithNoPlan (talk) 12:57, 25 July 2018 (UTC)

Found other case https://github.com/ms609/citation-bot/pull/438 AManWithNoPlan (talk) 14:04, 25 July 2018 (UTC)


{{fixed}}

diff This edit added a link to http://digitallibrary.amnh.org/bitstream/handle/2246/5906/v3/dspace/updateIngest/pdfs/N3610.pdf%3Bjsessionid%3D23866600E2892FD54861C9246EBA1DBB?sequence%3D1 which was dead. (t) Josve05a (c) 14:40, 25 July 2018 (UTC)

that does suck that the author of the journal article explicitly tells us to use a dead URL AManWithNoPlan (talk) 16:03, 25 July 2018 (UTC)
I will look into adding some code to test the url AManWithNoPlan (talk) 16:32, 25 July 2018 (UTC)

https://github.com/ms609/citation-bot/pull/440 AManWithNoPlan (talk) 04:50, 26 July 2018 (UTC)

{{fixed}}

When converting cite arxiv to cite journal, update the year/date

Status
new bug
Reported by
Headbomb {t · c · p · b} 13:19, 24 July 2018 (UTC)
Type of bug
Inconvenience
What happens
When converting a cite arxiv to a cite journal, the bot keeps the original date
What should happen
The bot should use the date as can be determined via bibcode/doi/pmids/other versions of records
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=LOBPCG&diff=prev&oldid=851764808
We can't proceed until
Agreement on the best solution


How about this: https://github.com/ms609/citation-bot/pull/436/ AManWithNoPlan (talk) 20:44, 24 July 2018 (UTC)

@AManWithNoPlan: not sure what that does exactly, but the net should be cast as wide as possible for anything that trigger an upgrade from cite arxiv to cite journal/cite conference/cite book (ISBN, Bibcodes, PMID, PMC, etc... if those apply) Headbomb {t · c · p · b} 21:25, 24 July 2018 (UTC)
It catches all cite webs and cite arxiv that do not already have a doi. AManWithNoPlan (talk) 23:28, 24 July 2018 (UTC)
@AManWithNoPlan: what happens if the preprint is published, but without a doi but other identifiers, like bibcodes? Headbomb {t · c · p · b} 12:56, 25 July 2018 (UTC)
Not sure. Do you have an example to test? I think that you have to go through the DOI database first. AManWithNoPlan (talk) 13:25, 25 July 2018 (UTC)

Here's possibly a case

  • Arnold, Douglas N.; Fowler, Kristine K. (2011). "Nefarious Numbers". Notices of the American Mathematical Society. 58 (3): 434–437. arXiv:1010.0278. Bibcode:2010arXiv1010.0278A.
  • arXiv:1010.0278 says it's published in "Notices Amer. Math. Soc., 58(3):434-437, 2011" The metadata is poor, and the upgrade from arxiv to journal is messy [3], but it's an example of where it could be done in theory. There are better examples out there, with better metadata, so I'll keep looking for those. Headbomb {t · c · p · b} 13:43, 25 July 2018 (UTC)
Neither one of those cases has a DOI to be found using the ARXIV database AManWithNoPlan (talk) 15:26, 25 July 2018 (UTC)

{{fixed}} code merged

Some DOI data is junk

Status
new bug
Reported by
65.94.42.168 (talk) 05:33, 25 July 2018 (UTC)
Type of bug
Deleterious: Human-input data is deleted or articles are otherwise significantly affected.
What happens
BOT assisted edit at M32p deleted the journal article name and replaced it with a nonsense journal article name, deleted the authors, deleted the journal volume, issue, publication date
What should happen
It should have been an author correction; the information for the publication journal date, volume, issue, etc is available via http://adsabs.harvard.edu/abs/2018MNRAS.475.2754H

I suggest that the bot crosscheck PMID, arXiv and bibcode against the DOI to see if the DOi is faulty. If all other uses match against each other, and the DOI doesn't then the DOI is in error.

Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=M32p&action=historysubmit&type=revision&diff=851774001&oldid=851769468
We can't proceed until
Agreement on the best solution


I think this is very local to OUP manuscripts, and it's probably just simpler to check that the DOI info does not resolve to a pre-production placeholder thing. Headbomb {t · c · p · b} 13:00, 25 July 2018 (UTC)

https://github.com/ms609/citation-bot/pull/439/files AManWithNoPlan (talk) 14:03, 25 July 2018 (UTC)

Just to clarify. I deleted the title and the authors, everything in fact, since it was poorly-formatted and generating CS1 errors. Then used the bot to recreate the citation. So the bot didn't do anything too radical like overwriting good info with bad, but it did pick up the wrong title as described. Lithopsian (talk) 20:03, 25 July 2018 (UTC)

{{fixed}} we will add more checking as more oddities are found AManWithNoPlan (talk) 12:49, 26 July 2018 (UTC)

Bot moves parameters for no reason

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 19:29, 26 July 2018 (UTC)
Type of bug
Cosmetic
What happens
The bot takes existing parameters and put them in new locations
What should happen
Leave things where they are
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Quark&diff=prev&oldid=852114776
We can't proceed until
Agreement on the best solution


This is because of the new code that allows DOI information to override Arxiv information. I know how to fix this. The citation forgets and then remembers the year. I need to change it to a placeholder and then change it back or delete it. AManWithNoPlan (talk) 19:48, 26 July 2018 (UTC)

https://github.com/ms609/citation-bot/pull/443 AManWithNoPlan (talk) 20:08, 26 July 2018 (UTC)

Converts empty coauthors into empty vauthors

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 02:23, 27 July 2018 (UTC)
Type of bug
Improvement
What happens
The bot converts |coauthors= to |vauthors=
What should happen
If |coauthors= is non-empty, leave it alone. If |coauthors= is empty, remove it.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Biological_neuron_model&diff=prev&oldid=852162733
We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/445 typo fixing is hard AManWithNoPlan (talk) 02:45, 27 July 2018 (UTC)

Bot converts orig-year to origyear

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 03:48, 27 July 2018 (UTC)
What happens
Bot converts |orig-year= to |origyear=
What should happen
Leave it alone (or convert |origyear= to |orig-year=), since |orig-year= is the canonical use.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Absolute_zero&diff=prev&oldid=852171128
We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/445/files#diff-bb37b1a3125b1a812ed46c7dfdccae3b Added

https://github.com/ms609/citation-bot/pull/446 Check this out. It adds code to automatically generate the 1-99 stuff and adds the ability to split out parameters that should be recognized but not changed to. AManWithNoPlan (talk) 04:40, 27 July 2018 (UTC)

cite web handling improvements

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 13:14, 27 July 2018 (UTC)
What happens
Does not convert cite web to cite journal when pmc is set
What should happen
convert and improve cite web to cite journal when it makes sense to do so
Relevant diffs/links
When running on [4] it misses one improvement. When I convert a cite web (with pmc) to cite journal [5], the bot can then kick in on that cite [6].
We can't proceed until
Agreement on the best solution


That’s been wrong forever. Good catch. Also pmid too. https://github.com/ms609/citation-bot/pull/447 AManWithNoPlan (talk) 13:37, 27 July 2018 (UTC)

Do not remove the publisher

Status
Reported by
(t) Josve05a (c) 06:46, 24 July 2018 (UTC)
Type of bug
Deleterious
What happens
The bot removes all |publisher= in {{cite journal}}
What should happen
It should not remove human inputted fields.
Relevant diffs/links
Special:Diff/851728907&oldid=833759254
Replication instructions
Run the bot on Paul Ashbee
We can't proceed until
Agreement on the best solution


Personally I love the new functionality. I'll be very sad to see it go. Headbomb {t · c · p · b} 14:27, 28 July 2018 (UTC)

@Headbomb: You want the bot to remove publisher fields from the citation if manually provided? Why? (t) Josve05a (c) 21:02, 28 July 2018 (UTC)
This is NOT a new feature; it has been highly regarded for a long time. People seem to think that providing a publisher is too much information. Also, that changes over time and is generally not useful. I have written the code, but it is not in because of lack of agreement.

https://github.com/ms609/citation-bot/pull/432 AManWithNoPlan (talk) 22:39, 28 July 2018 (UTC)

Well, it is a manually entered field, and the cite template has been changed to allow for both journal and publisher now, so consensus over at the template's talk page seems to be to allow both fields. (t) Josve05a (c) 09:27, 29 July 2018 (UTC)


{{notabug}}

redundant page range

Status
new bug
Reported by
Headbomb {t · c · p · b} 13:45, 27 July 2018 (UTC)
Type of bug
Improvement
What happens
leaves the citation as {{cite book ... |pages=23–23 ...}}
What should happen
should convert to

{{cite book ... |page=23 ...}}

Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Marie_Curie&diff=prev&oldid=852228914
We can't proceed until
Agreement on the best solution
Requested action from maintainer
write code


This should also be reported to the CS1 people too so they can have the templates do this just like they convert dashes? AManWithNoPlan (talk) 13:56, 27 July 2018 (UTC)

https://github.com/ms609/citation-bot/pull/454 Does it to years and pages but not issues. AManWithNoPlan (talk) 02:29, 29 July 2018 (UTC)

Should do it to issues too. I'll post a notice at Help talk:CS1 too. Headbomb {t · c · p · b} 14:16, 30 July 2018 (UTC)
Issues added. AManWithNoPlan (talk) 14:47, 30 July 2018 (UTC)

{{fixed}}
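
As a rough illustration of the requested |pages=23–23 to |page=23 cleanup, here is a minimal sketch. It is not taken from the pull request above; the function name `collapse_redundant_range`, the regular expression, and the dict representation of template parameters are assumptions for the example, and only |pages= is handled (years and issues would need the same treatment).

```python
import re

# Matches "23-23" or "23–23" (hyphen or en dash), with optional spaces.
RANGE = re.compile(r"^\s*(\w+)\s*[-–]\s*(\w+)\s*$")


def collapse_redundant_range(params):
    """Return a copy of params where |pages=N–N becomes |page=N.

    params is a plain dict of parameter names to values (assumed
    representation). Real ranges such as 23–25 are left untouched.
    """
    result = dict(params)
    match = RANGE.match(result.get("pages", ""))
    # Do not overwrite an existing |page= value.
    if match and match.group(1) == match.group(2) and not result.get("page"):
        result["page"] = match.group(1)
        del result["pages"]
    return result
```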

Leave journal capitalization after : alone

Status
new bug
Reported by
Headbomb {t · c · p · b} 02:20, 28 July 2018 (UTC)
Type of bug
Inconvenience/Cosmetic
What happens
converts |journal=Historical Biology: An International Journal of Paleobiology to |journal=Historical Biology: an International Journal of Paleobiology
What should happen
leave : An as is
Relevant diffs/links
[7]
We can't proceed until
Agreement on the best solution


This is always an ongoing battle of styles. Added this one: https://github.com/ms609/citation-bot/pull/448 and a more generic fix https://github.com/ms609/citation-bot/pull/453 AManWithNoPlan (talk) 02:36, 29 July 2018 (UTC)

{{fixed}}

Status
new bug
Reported by
Headbomb {t · c · p · b} 02:25, 28 July 2018 (UTC)
Type of bug
Improvement
What happens
If an amazon link is given and |isbn= exists, the amazon link is converted to |asin=
What should happen
If an amazon link is given and |isbn= exists, the amazon link is removed
Relevant diffs/links
[8]
We can't proceed until
Agreement on the best solution
Requested action from maintainer
code added


The bot is making the page better, but you are right that it could do more, especially if the ASIN is an ISBN. AManWithNoPlan (talk) 02:32, 28 July 2018 (UTC)

It's better yes, but then another edit needs to be made (User:CitationCleanerBot will cleanup what it can every now and then). The bot should also remove asin when isbn is present in general, the link-->asin is just an intermediate step. Headbomb {t · c · p · b} 02:53, 28 July 2018 (UTC)
It seems to me that perhaps only if the asin is the same as the isbn. AManWithNoPlan (talk) 02:34, 29 July 2018 (UTC)
It should straight up be removed. ASIN / amazon links should only be used when there's nothing else. See Help:CS1#Identifiers, ASIN section, or CitationCleanerBot 3. Headbomb {t · c · p · b} 04:27, 30 July 2018 (UTC)
But my retirement savings are all invested in Amazon Stock!!!!. Just joking. https://github.com/ms609/citation-bot/pull/468 AManWithNoPlan (talk) 15:25, 30 July 2018 (UTC)
A few subtleties here. Links with ASINs starting with letters / ASINs starting with letters should also be removed when ISBNs exist, or converted to |ASIN= when no ISBNs are set. If there is no ISBN, ASINs starting with numbers should be converted to ISBNs when possible (however, those starting with |asin=630... aren't ISBNs). Headbomb {t · c · p · b} 15:44, 30 July 2018 (UTC)
I updated the code. If there is an ISBN, then ignore ASIN. If the ASIN is an ISBN then add as ISBN, if not then add as ASIN. AManWithNoPlan (talk) 17:32, 30 July 2018 (UTC)
That doesn't sound right. I think it should be: if there is an ISBN or OCLC, remove the ASIN. If there is no ISBN and the ASIN starts with a letter or 630, leave the ASIN alone. If there is no ISBN and the ASIN is a valid ISBN, move the ASIN to |ISBN=. – Jonesey95 (talk) 17:40, 30 July 2018 (UTC)
It looks like it is all good now. AManWithNoPlan (talk) 19:32, 30 July 2018 (UTC)
Do we know for certain that 630-series numbers are not isbns? Have the isbn people given that series over to amazon? If there is some sort of official acknowledgement that 630-series numbers are not isbns (even though they validate as isbn numbers) then perhaps cs1|2 should stop adding articles to Category:CS1 maint: ASIN uses ISBN when |asin= holds a 630-series number. Similarly, the documentation for |asin= should be updated to recognize the 630 series.
Trappist the monk (talk) 11:08, 31 July 2018 (UTC)
Not that I'm aware. Doesn't mean that such a thing doesn't exist though, just that I never found it. There is List of ISBN identifier groups, however.Headbomb {t · c · p · b} 11:25, 31 July 2018 (UTC)
Plug into https://www.isbn.org/ISBN_converter the ASIN 6303007759 and see that it is invalid. AManWithNoPlan (talk) 15:19, 31 July 2018 (UTC)

{{fixed}}
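
The ASIN rules proposed by Jonesey95 above can be sketched roughly as below. This is an illustration only and not the bot's implementation: the function names `is_valid_isbn10` and `asin_action` and the returned action strings are hypothetical, and the final fallback (keep an ASIN that starts with a digit but fails the ISBN-10 checksum) is an assumption the discussion does not settle.

```python
def is_valid_isbn10(code):
    """Check the ISBN-10 checksum: weights 10 down to 1, total divisible by 11."""
    if len(code) != 10:
        return False
    total = 0
    for position, char in enumerate(code):
        if char in "0123456789":
            value = int(char)
        elif char in "Xx" and position == 9:
            value = 10  # "X" is only allowed as the check digit
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def asin_action(asin, has_isbn=False, has_oclc=False):
    """Return "remove", "keep", or "move_to_isbn" for an |asin= value."""
    if has_isbn or has_oclc:
        return "remove"
    # ASINs starting with a letter, or in the 630 series, are left alone.
    if asin[:1].isalpha() or asin.startswith("630"):
        return "keep"
    if is_valid_isbn10(asin):
        return "move_to_isbn"
    return "keep"  # assumption: anything else is left untouched
```

The explicit 630 check is needed because the checksum alone does not exclude those numbers: 6303007759 passes the ISBN-10 checksum even though it is reported above as not being a real ISBN.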

Don't capitalize "De" / Capitalize FASEB

Status
new bug
Reported by
Headbomb {t · c · p · b} 04:29, 4 August 2018 (UTC)
Type of bug
improvement
What happens
Bot capitalizes "De"
What should happen
should be "de"
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Regulation_of_electronic_cigarettes&diff=prev&oldid=853352540
We can't proceed until
Agreement on the best solution
Requested action from maintainer
add more to the list


Status
new bug
Reported by
Headbomb {t · c · p · b} 04:48, 4 August 2018 (UTC)
Type of bug
improvement
What happens
Faseb
What should happen
FASEB
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=W._Mark_Saltzman&diff=853353920&oldid=853353896
We can't proceed until
Agreement on the best solution
Requested action from maintainer
add more to the list


Gonna anticipate a few more here

Uppercase

  • AJHG
  • BBA
  • BMC
  • BMJ
  • DNA
  • EMBO
  • FASEB
  • FEBS
  • FEMS
  • JAMA
  • MNRAS
  • NEJM
  • NYT
  • PCR
  • PLOS/PLoS
  • PNAS
  • UK
  • USA

Lowercase (but first-letter capital allowed after a . or :)

  • a
  • an
  • el
  • de
  • la
  • le
  • für
  • of
  • on
  • the
  • van
  • von

Some of the lowercase ones can be confused with abbreviations/other words. Headbomb {t · c · p · b} 05:08, 4 August 2018 (UTC)
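
To make the two lists concrete, here is a minimal sketch of how they might be applied to a journal name. It is not the bot's code: the function name `fix_journal_case` and the word-by-word approach are assumptions for illustration. It also shows the confusion risk mentioned above, since a naive rule like this would turn the "A" in "Physical Review A" into "a".

```python
UPPERCASE = {
    "AJHG", "BBA", "BMC", "BMJ", "DNA", "EMBO", "FASEB", "FEBS", "FEMS",
    "JAMA", "MNRAS", "NEJM", "NYT", "PCR", "PLOS", "PNAS", "UK", "USA",
}
LOWERCASE = {
    "a", "an", "el", "de", "la", "le", "für", "of", "on", "the", "van", "von",
}
PRESERVE = {"PLoS"}  # both PLOS and PLoS are acceptable, so PLoS is kept


def fix_journal_case(title):
    """Apply the uppercase/lowercase word lists to a journal title."""
    fixed = []
    for index, word in enumerate(title.split(" ")):
        previous = fixed[-1] if fixed else ""
        # A first-letter capital is allowed at the start and after "." or ":".
        starts_segment = index == 0 or previous.endswith((".", ":"))
        if word in PRESERVE:
            fixed.append(word)
        elif word.upper() in UPPERCASE:
            fixed.append(word.upper())
        elif word.lower() in LOWERCASE and not starts_segment:
            fixed.append(word.lower())
        else:
            fixed.append(word)
    return " ".join(fixed)
```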

Upon further review, I think one of the main issues is when the journal is wikilinked, the bot goes cray with capitalization. Headbomb {t · c · p · b} 06:01, 4 August 2018 (UTC)
Do you have an example of Wikilinks? We do not touch those. I really wish the databases we query actually formatted the titles right. AManWithNoPlan (talk) 13:15, 4 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/504 AManWithNoPlan (talk) 14:58, 4 August 2018 (UTC)

{{fixed}}

Examples of wikilinks: [9] (at the very bottom) and [10] (look for Agricultural and Forest Meteorology and Proceedings of the National Academy of Sciences of the USA). Headbomb {t · c · p · b} 15:56, 4 August 2018 (UTC)

that is a regression. https://github.com/ms609/citation-bot/pull/506 AManWithNoPlan (talk) 16:13, 4 August 2018 (UTC)
the above also covers links too. AManWithNoPlan (talk) 20:25, 4 August 2018 (UTC)

{{fixed}}

bot broke citation template by leaving |work= in the template

Status
{{fixed}} enough
Reported by
Trappist the monk (talk) 14:53, 2 August 2018 (UTC)
Type of bug
Inconvenience
What happens
With this edit, citation bot converted this somewhat correct template:
{{Citation|title=Reauthorizing the Elementary and Secondary Education Act|url=https://dx.doi.org/10.1057/9781137030931.0011|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|access-date=2018-07-09}}
"Reauthorizing the Elementary and Secondary Education Act", President Obama and Education Reform, Palgrave Macmillan, ISBN 9781137030931, retrieved 2018-07-09

to this broken template:

{{Citation|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|doi=10.1057/9781137030931.0011|chapter=Reauthorizing the Elementary and Secondary Education Act|title = President Obama and Education Reform|year = 2012}}
"President Obama and Education Reform", President Obama and Education Reform, Palgrave Macmillan, 2012, doi:10.1057/9781137030931.0011, ISBN 9781137030931 {{citation}}: |chapter= ignored (help)

The bot should have removed |work= when it added |chapter= because |work= (and its alias) is the mechanism that switches {{citation}} from 'book style' to 'periodical style'.

We can't proceed until
Agreement on the best solution


Perhaps just delete |work= when empty or when has chapter and work is equal to series, journal, title, chapter, or publisher. AManWithNoPlan (talk) 16:39, 2 August 2018 (UTC)

Another option is to change to {{cite book}} AManWithNoPlan (talk) 16:46, 2 August 2018 (UTC)
But in this case, |work= wasn't empty ...
Changing to {{cite book}} wouldn't fix the problem for two reasons:
  1. the bot created a new |title= by copying content from |work= and retained |work= so now we have redundant information in the rendered citation:
    {{Cite book|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|doi=10.1057/9781137030931.0011|chapter=Reauthorizing the Elementary and Secondary Education Act|title = President Obama and Education Reform|year = 2012}}
    "Reauthorizing the Elementary and Secondary Education Act". President Obama and Education Reform. Palgrave Macmillan. 2012. doi:10.1057/9781137030931.0011. ISBN 9781137030931. {{cite book}}: |work= ignored (help)
  2. style change from cs2 to cs1; and if there were short-form references depending on the automatic CITEREF links created by {{citation}}, those links are now broken
Trappist the monk (talk) 17:02, 2 August 2018 (UTC)
Good points. The real problem is that citation templates have so many parameters that are almost the same but not the same. We cannot fix that. It seems that we could implement code that checks for |work= and if the new title/chapter/publisher/journal matches it then drop it. AManWithNoPlan (talk) 17:09, 2 August 2018 (UTC)
In cs1|2 the internal parameter is Periodical. Any of |journal=, |newspaper=, |magazine=, |work=, |website=, |periodical=, |encyclopedia=, |encyclopaedia=, |dictionary=, |mailinglist= are aliases that feed into that internal parameter so all of them generally act the same. Module:Citation/CS1 does look at the names that were used in the template source because for {{citation}} the name of the parameter gives a clue to how the citation should be rendered. For example, when the source for Periodical is |journal=, Module:Citation/CS1 knows to render |volume=, |issue=, and |page(s)= using academic journal style and to emit the journal style COinS metadata. {{citation}} balks at the combination of any Periodical parameter in the presence of any Chapter alias. In the example template, copying the content of a Periodical alias to |title= should blank the Periodical alias so that {{citation}} isn't confused.
Trappist the monk (talk) 00:12, 3 August 2018 (UTC)
Just for the record, the bot is copying nothing: it just finds the same string again in its database search. AManWithNoPlan (talk) 00:27, 3 August 2018 (UTC)
Just need some code that notices if work===title and such and then deletes work. Case insensitive of course. AManWithNoPlan (talk) 00:30, 3 August 2018 (UTC)
Really? What if work and title are off by one character because of a typo or whatever? If the bot is correcting a malformed citation, as it attempted to do in this example, and ends up with a configuration that is not supported then perhaps the correct response is to do nothing.
Trappist the monk (talk) 13:20, 3 August 2018 (UTC)
Not sure exactly what is best, but this is a good first step https://github.com/ms609/citation-bot/pull/507 AManWithNoPlan (talk) 00:33, 5 August 2018 (UTC)
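
For readers following along, the rule floated earlier in this thread (delete |work= when it is empty, or when the template has |chapter= and |work= equals the series, journal, title, chapter, or publisher, ignoring case) might look something like the sketch below. It is an illustration under assumptions, not the code in the pull request: the function name `drop_redundant_work` and the dict representation of parameters are hypothetical.

```python
COMPARE = ("series", "journal", "title", "chapter", "publisher")


def drop_redundant_work(params):
    """Return a copy of params with a redundant or empty |work= removed.

    params is a plain dict of parameter names to values (assumed
    representation). Comparison is exact apart from case and
    surrounding whitespace.
    """
    result = dict(params)
    if "work" not in result:
        return result
    work = result["work"].strip().casefold()
    if work == "":
        del result["work"]
        return result
    has_chapter = bool(result.get("chapter", "").strip())
    others = {result.get(name, "").strip().casefold() for name in COMPARE}
    others.discard("")
    if has_chapter and work in others:
        del result["work"]
    return result
```

Because the match is exact, a |work= that differs from |title= by a single typo is left in place, which is the case Trappist the monk raises above; in that situation the sketch changes nothing rather than guessing.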

bot added url for a different article

Status
{{fixed}}
Reported by
Trappist the monk (talk) 14:47, 6 August 2018 (UTC)
Relevant diffs/links
diff
We can't proceed until
Agreement on the best solution


I noticed this because the referenced edit caused a url–wikilink conflict error. The original template has an inappropriate wikilink in |title=:

{{cite journal | doi = 10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2 | last1 = Lamanna | first1 = M.C. | last2 = Martinez | first2 = R.D. | last3 = Smith | first3 = J.B. | year = 2002 | title = A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]". | url = | journal = Journal of Vertebrate Paleontology | volume = 22 | issue = 1| pages = 58–69 }}
Lamanna, M.C.; Martinez, R.D.; Smith, J.B. (2002). "A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of Patagonia"". Journal of Vertebrate Paleontology. 22 (1): 58–69. doi:10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2.

From that, the bot made this:

{{cite journal | doi = 10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2 | last1 = Lamanna | first1 = M.C. | last2 = Martinez | first2 = R.D. | last3 = Smith | first3 = J.B. | year = 2002 | title = A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]" | url = http://www.bioone.org/doi/pdf/10.4202/app.00132.2014| journal = Journal of Vertebrate Paleontology | volume = 22 | issue = 1| pages = 58–69 | format = Full text }}
Lamanna, M.C.; Martinez, R.D.; Smith, J.B. (2002). "A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]"" (Full text). Journal of Vertebrate Paleontology. 22 (1): 58–69. doi:10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2. {{cite journal}}: URL–wikilink conflict (help)

If you follow the doi you get to the article that matches the bibliographic data. If you follow the title-link you end up at a vaguely related article (they are both about abelisaurids) that does not match the bibliographic data.

The value in the original |title= is malformed: it has a wikilink (it shouldn't) and it has extraneous punctuation (the single unmatched double quote mark and a period – neither of which belong there). Still, the bot should not be adding a url when |title= is wikilinked either explicitly (has wikilink markup) or indirectly by |title-link=, or has wikilinks (which are almost always inappropriate). It could be argued that, for |title= parameters with single-word wikilink markup, the markup should be removed. More difficult to know what to do with wikilinks in the form [[target|label]] because this form of wikilink is commonly used when linking to sources at, for example, wikisource.

Trappist the monk (talk) 14:47, 6 August 2018 (UTC)

Bad link: That is bad data in the database, but I have improved the code and the specific example will not occur https://github.com/ms609/citation-bot/pull/512 AManWithNoPlan (talk) 16:27, 6 August 2018 (UTC)
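
A rough sketch of the guard requested in this report (do not add a url when |title= is already linked) is given below in Python. It is an assumption about how such a check could be written, not the code in the pull request; `may_add_url` and the dict of parameters are hypothetical names.

```python
import re

# Matches [[target]] as well as [[target|label]].
WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")

def may_add_url(params: dict) -> bool:
    """Return True only when adding |url= cannot clash with a linked title."""
    if params.get("url", "").strip():
        return False  # a url is already present
    if params.get("title-link", "").strip():
        return False  # title is linked indirectly via |title-link=
    # title is linked explicitly when it contains wikilink markup
    return WIKILINK.search(params.get("title", "")) is None
```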

caps again

Touches 'zu', 'des', 'aus', 'dem', 'del', 'dei', 'of', 'di', 'ed', 'du', 'de', 'dans', 'les', 'e'. Headbomb {t · c · p · b} 03:09, 7 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/516 AManWithNoPlan (talk) 16:00, 7 August 2018 (UTC)

{{fixed}}

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 20:34, 7 August 2018 (UTC)
What happens
Adds links to handle.net
What should happen
use |hdl= instead
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Fermi_paradox&diff=prev&oldid=853920938
We can't proceed until
Agreement on the best solution


The problem is that it tries to add the handle as |hdl= and fails since |hdl= is already set. The solution is to view that as a success. This bug means that if you run the bot once you will get hdl set, and then a second run will add it as a url. https://github.com/ms609/citation-bot/pull/517 AManWithNoPlan (talk) 21:43, 7 August 2018 (UTC)

You are working my butt off by the way. Which is good. AManWithNoPlan (talk) 21:43, 7 August 2018 (UTC)
Well it's the first time in a long looooooong while that anyone's been working on CitationBot so I'm making sure to take advantage of the opportunity. Headbomb {t · c · p · b} 22:25, 7 August 2018 (UTC)
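
To make the hdl fix described above concrete: the idea is that "the identifier is already there with the same value" counts as success, so the url fallback is never reached on a second run. The Python sketch below is an illustration under assumptions, not the bot's code; `add_identifier`, `record_handle`, and the fallback URL form are all hypothetical.

```python
def add_identifier(params: dict, name: str, value: str) -> bool:
    """Return True when params ends up holding `value` under `name`."""
    current = params.get(name, "").strip()
    if not current:
        params[name] = value
        return True
    # Already set to the same value: treat this as success, not failure.
    return current == value

def record_handle(params: dict, handle: str) -> None:
    """Prefer |hdl=; fall back to |url= only if |hdl= holds something else."""
    if not add_identifier(params, "hdl", handle) and not params.get("url"):
        params["url"] = "https://hdl.handle.net/" + handle  # assumed URL form
```

Running `record_handle` twice with the same handle leaves the parameters unchanged after the first call.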

.pdf at the end of dois

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 13:01, 9 August 2018 (UTC)
Type of bug
Inconvenience
What happens
bot adds |doi=10.1007/BF00428580.pdf based on |url=https://link.springer.com/content/pdf/10.1007/BF00428580.pdf
What should happen
Bot should be smart and strip .pdf at the end of dois.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Otto_Kandler&diff=prev&oldid=854171791
We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/523 AManWithNoPlan (talk) 21:10, 9 August 2018 (UTC)
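
As a sketch of the behaviour asked for in this report, a DOI taken from a URL can have a trailing .pdf stripped before it is used. The Python below is an illustration, not the code in the pull request; `doi_from_url` and the simplified DOI pattern are assumptions, and a real DOI that genuinely ends in ".pdf" would be mangled by it.

```python
import re
from typing import Optional

# Simplified DOI pattern (an assumption): "10.", a registrant code, "/", a suffix.
DOI_IN_URL = re.compile(r"10\.\d{4,9}/[^\s?#]+")

def doi_from_url(url: str) -> Optional[str]:
    """Extract a DOI from a URL, dropping a file extension left on the end."""
    match = DOI_IN_URL.search(url)
    if match is None:
        return None
    doi = match.group(0)
    if doi.lower().endswith(".pdf"):
        doi = doi[: -len(".pdf")]
    return doi
```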

Better cross-checking against NLM/NIH databases

Running CitationBot on doi:10.1073/pnas.171325998 finds PMC 58796, but not PMID 11573006. Basically, the bot should query both Pubmed and PubMed Central in every possible way until each of doi/pmid/pmc is found, and iterate when new identifiers are found.

Pubmed
PubMed Central

of citation templates in the NLM/NIH databases, and cross-reference things with each other.

The bot should also not assume the queries return 'complete' results. Very often, a PMID entry won't list the PMC, even if a PMC exists and could be discoverable by a DOI query (and vice-versa for PMCs listing a DOI, but not a PMID, or a PMID, but not doi, or every other such combination). Headbomb {t · c · p · b} 04:42, 9 August 2018 (UTC)

I noticed that years ago. But, there were so many other issues to deal with that I forgot about it. AManWithNoPlan (talk) 14:11, 9 August 2018 (UTC)
they changed their xml output. https://github.com/ms609/citation-bot/pull/530 https://github.com/ms609/citation-bot/pull/533 AManWithNoPlan (talk) 22:04, 9 August 2018 (UTC)
They changed the DOI search method https://github.com/ms609/citation-bot/pull/534 This also includes tests so if they change it again we will see it. AManWithNoPlan (talk) 17:39, 10 August 2018 (UTC)
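
The "iterate when new identifiers are found" request above amounts to repeating the lookups until nothing new turns up. A minimal Python sketch of that loop follows. It makes no network calls: the two lookup functions are toy stand-ins backed by small tables (holding only the identifiers quoted in this thread), and every name in it is hypothetical rather than taken from the bot.

```python
from typing import Callable, Dict, Iterable

Ids = Dict[str, str]
Lookup = Callable[[Ids], Ids]
WANTED = ("doi", "pmid", "pmc")

def cross_reference(ids: Ids, lookups: Iterable[Lookup]) -> Ids:
    """Re-run every lookup until a full pass adds no new identifier."""
    known = {key: value for key, value in ids.items() if value}
    lookups = list(lookups)
    progress = True
    while progress and any(key not in known for key in WANTED):
        progress = False
        for lookup in lookups:
            for key, value in lookup(known).items():
                # Never overwrite; only fill in identifiers that are missing.
                if key in WANTED and value and key not in known:
                    known[key] = value
                    progress = True
    return known

# Toy stand-ins for the real database queries (assumptions for the demo).
PMC_BY_DOI = {"10.1073/pnas.171325998": "58796"}
PMID_BY_PMC = {"58796": "11573006"}

def pmid_from_pmc(known: Ids) -> Ids:
    pmid = PMID_BY_PMC.get(known.get("pmc", ""))
    return {"pmid": pmid} if pmid else {}

def pmc_from_doi(known: Ids) -> Ids:
    pmc = PMC_BY_DOI.get(known.get("doi", ""))
    return {"pmc": pmc} if pmc else {}
```

With the lookups listed as `[pmid_from_pmc, pmc_from_doi]`, the first pass finds only the PMC and the second pass uses it to find the PMID, which is the case a single round of queries misses.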


{{fixed}}

Status
  Fixed in GitHub Pull 526
Reported by
Headbomb {t · c · p · b} 13:09, 9 August 2018 (UTC)
Type of bug
Improvement
What happens
Bot adds |url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.752.4896
What should happen
Bot adds |citeseerx=10.1.1.752.4896
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Quantum_nonlocality&diff=prev&oldid=854173273
We can't proceed until
new code
Requested action from maintainer
new code


See also the hdl issue above. Headbomb {t · c · p · b} 13:10, 9 August 2018 (UTC)

completely unrelated to hdl issue. AManWithNoPlan (talk) 13:54, 9 August 2018 (UTC)
Seems exactly the same type of issue to me: failing to use |citeseerx=, just like it failed to use |hdl=, but you're the coder here. Headbomb {t · c · p · b} 14:05, 9 August 2018 (UTC)
The difference is that in the case of hdl, it already had the hdl set, so it failed to add it and then fell back on adding it as a url. In the case of citeseerx, the bot has no code to even add one. AManWithNoPlan (talk) 14:11, 9 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/526 AManWithNoPlan (talk) 17:53, 10 August 2018 (UTC)
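
For illustration, converting the reported url into a |citeseerx= value could be done along the lines of the Python sketch below. It is not the code in pull 526; `citeseerx_id_from_url` is a hypothetical name, and the assumption that CiteSeerX identifiers always look like 10.1.1.x.y is based only on the example in this report.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed identifier shape, taken from the single example in the report.
CITESEERX_ID = re.compile(r"^10\.1\.1\.\d+\.\d+$")

def citeseerx_id_from_url(url: str) -> Optional[str]:
    """Return the CiteSeerX identifier carried in a viewdoc URL, if any."""
    parsed = urlparse(url)
    if parsed.hostname != "citeseerx.ist.psu.edu":
        return None
    values = parse_qs(parsed.query).get("doi", [])
    if values and CITESEERX_ID.match(values[0]):
        return values[0]
    return None
```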

French words that have internal apostrophes

Touches 'l'', 'd''. Headbomb {t · c · p · b} 03:09, 7 August 2018 (UTC)

I need to think about 'l'' and '' in words like d'Évaporation AManWithNoPlan (talk) 16:00, 7 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/538 and https://github.com/ms609/citation-bot/pull/539 AManWithNoPlan (talk) 23:06, 10 August 2018 (UTC)

HORRIBLE to fix, but {{fixed}} AManWithNoPlan (talk) 02:31, 12 August 2018 (UTC)

In cite journal, if work is set, publisher isn't removed, but if journal is set, publisher is removed

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 16:52, 11 August 2018 (UTC)
What happens
See diffs
What should happen
Skip intermediate diff
Relevant diffs/links
Chronologically, [13] [14] [15]
We can't proceed until
Agreement on the best solution


Work is such a poorly used parameter that removing publisher based upon it is dubious. I have added this code https://github.com/ms609/citation-bot/pull/545 so that if |work= is set and the journal title happens to be the same, then |work= is changed to |journal=. AManWithNoPlan (talk) 17:59, 11 August 2018 (UTC)

Hence why this should be restricted to the cite journal template since in that template work and journal are aliases. Headbomb {t · c · p · b} 18:48, 11 August 2018 (UTC)

wikilinked titles

Status
{{fixed}}
Reported by
Trappist the monk (talk) 14:47, 6 August 2018 (UTC)
Relevant diffs/links
diff
We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/525 AManWithNoPlan (talk) 21:09, 9 August 2018 (UTC)
@AManWithNoPlan: not sure I know what's being done in that exactly, but will this strip |journal=[[Journal of Foobar]] to |journal=Journal of Foobar? Because if so, it shouldn't. Headbomb {t · c · p · b} 21:42, 9 August 2018 (UTC)
It removes all wikilinks from |title=. It removes all wikilinks from |journal= UNLESS the link is the entire name of the journal. AManWithNoPlan (talk) 21:49, 9 August 2018 (UTC)
If you look at the changed files, one of them is a test suite and you can see the changes. AManWithNoPlan (talk) 22:59, 11 August 2018 (UTC)
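
The rule described above (strip every wikilink from |title=, but keep a wikilink in |journal= when it spans the whole value) can be sketched as follows. This is an illustrative Python version, not the bot's code or its test suite; the function names are hypothetical, and keeping the label of a piped link is an assumption.

```python
import re

# [[target]] -> target, [[target|label]] -> label
WIKILINK = re.compile(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]+)\]\]")
WHOLE_VALUE_LINK = re.compile(r"\s*\[\[[^\[\]]+\]\]\s*")

def strip_wikilinks(text: str) -> str:
    """Replace each wikilink with its visible text."""
    return WIKILINK.sub(r"\1", text)

def clean_title(value: str) -> str:
    return strip_wikilinks(value)

def clean_journal(value: str) -> str:
    # A link covering the entire journal name is left alone.
    if WHOLE_VALUE_LINK.fullmatch(value):
        return value
    return strip_wikilinks(value)
```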

The bot should trim ResearchGate links like

to the simpler

Headbomb {t · c · p · b} 13:31, 11 August 2018 (UTC)

And upgrade http to https AManWithNoPlan (talk) 18:00, 11 August 2018 (UTC)
https://github.com/ms609/citation-bot/pulls AManWithNoPlan (talk) 01:26, 13 August 2018 (UTC)
Along the same lines: https://github.com/ms609/citation-bot/pull/552 AManWithNoPlan (talk) 14:05, 13 August 2018 (UTC)

{{fixed}}

Physical Review E → Physical Review e

Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 18:07, 14 August 2018 (UTC)
What happens
Physical Review E → Physical Review e
What should happen
leave Physical Review E alone
Relevant diffs/links
[16]
We can't proceed until
Agreement on the best solution


This should apply to every single character at the end of a string, or before a ':'. E.g. Journal of Physics E: Blah BLah BLuh or Chemical Physics A. Headbomb {t · c · p · b} 18:07, 14 August 2018 (UTC)

https://github.com/ms609/citation-bot/pull/560 AManWithNoPlan (talk) 19:40, 14 August 2018 (UTC)
This happened when we added support for the Spanish "the" word "e". That fixed a lot of Spanish things, but we forgot about "j chem phys e" type stuff. But come on, who splits their journals five ways? Obviously physics people do. AManWithNoPlan (talk) 19:49, 14 August 2018 (UTC)
That's because you haven't seen Proceedings of the Institution of Mechanical Engineers, parts A through P. Headbomb {t · c · p · b} 19:53, 14 August 2018 (UTC)
Those organic chemists just need part H, O, N, and C. :-) AManWithNoPlan (talk) 20:20, 14 August 2018 (UTC)
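
A sketch of the exception requested in this thread: small words are lowercased as before, except a single character at the end of the name or directly before a ':' (a series letter such as the E in Physical Review E). The Python below is an assumption about how that rule could be expressed, not the code in pull 560; `lower_small_words` and the short word list are illustrative only.

```python
SMALL_WORDS = {"e", "de", "du", "di", "of"}  # illustrative subset only

def lower_small_words(name: str) -> str:
    """Lowercase small words, but leave series letters such as 'E' alone."""
    words = name.split(" ")
    last = len(words) - 1
    result = []
    for index, word in enumerate(words):
        core = word.rstrip(":")
        # A lone character at the very end, or right before ':', is a series letter.
        is_series_letter = len(core) == 1 and (index == last or word.endswith(":"))
        if index > 0 and not is_series_letter and core.lower() in SMALL_WORDS:
            result.append(word.lower())
        else:
            result.append(word)
    return " ".join(result)
```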
Status
{{fixed}}
Reported by
Headbomb {t · c · p · b} 21:54, 14 August 2018 (UTC)
Type of bug
Improvement: The bot would be much better if ...
What should happen
simplify

to

See also User_talk:Citation bot#Researchgate links

We can't proceed until
Agreement on the best solution


https://github.com/ms609/citation-bot/pull/564 Learned some things too. AManWithNoPlan (talk) 00:21, 15 August 2018 (UTC)