CiteCompletion is a script that completes missing fields in citations to common English-language news sites on the English Knowledge. It takes the news article URL from the Knowledge article page, fetches the news page, and extracts the missing details of the news article based on per-site
An alternative to CiteCompletion. CiteCompletion handles its supported sites more thoroughly than REFLINKS and can complete existing citations, whereas REFLINKS supports all sites in a more generic way (it normally does not detect authors etc.) but only for bare URLs (no completion of existing
442:
Publication dates are stripped of timestamps and days of the week and converted to the predominant format used in the
Knowledge article (International, American or ISO, falling back to ISO if there is no predominant
Authors with multiple first names or multiple surnames are not supported (the script cannot determine whether, for 'Name Anothername Surname', Anothername should be part of
It does not modify non-templated, manually formatted citations, because it cannot interpret the existing data and might overwrite user-set data.
866:
Not all fields are found from all supported sites. CiteCompletion will be improved over time to correctly extract more data.
Otherwise, if there is no majority, default to the "2011-01-15" format (this avoids any accusation of American/International bias).
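The date-format decision described in this section (follow a {{use dmy dates}} or {{use mdy dates}} template if present, otherwise use the majority format, otherwise fall back to ISO) can be sketched in Python as below. This is only an illustration, not the script's actual C# code: the function name `choose_date_format`, the regular expressions, and the reading of "majority" as "strictly more occurrences than any other format" are all assumptions.

```python
import re
from collections import Counter

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Illustrative patterns for the three formats named in the text (assumed, not from the script).
PATTERNS = {
    "iso": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),              # 2011-01-15
    "dmy": re.compile(rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b"),  # 15 January 2011
    "mdy": re.compile(rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b"), # January 15, 2011
}


def choose_date_format(wikitext: str) -> str:
    """Return 'dmy', 'mdy' or 'iso' for dates inserted into this article."""
    lowered = wikitext.lower()
    # Step 1: an explicit date-format template wins.
    if "{{use dmy dates" in lowered:
        return "dmy"
    if "{{use mdy dates" in lowered:
        return "mdy"
    # Step 2: count existing date usage and take the clear leader.
    counts = Counter({name: len(p.findall(wikitext)) for name, p in PATTERNS.items()})
    (top, top_n), (_, second_n) = counts.most_common(2)
    if top_n > second_n:
        return top
    # Step 3: no majority, so fall back to the ISO format.
    return "iso"
```

An article with no dates at all, or with a tie between formats, ends up with "iso" under these assumptions.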
The tidied-up value is then appended to the citation. Existing values are not updated; values are only added if missing:
900:
The following are ideas that may or may not be implemented in CiteCompletion at some point in the future:
345:
fields is not specified. If one or more of the fields are missing, the HTML source of the URL is fetched.
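The check for missing fields might look roughly like the Python sketch below. It is a simplified illustration under stated assumptions: the helper names `parse_params` and `missing_fields` are hypothetical, the parser is naive (it does not handle nested templates or piped links), and treating |last= or |last1= as satisfying the author check is an assumption rather than documented behaviour.

```python
def parse_params(template: str) -> dict:
    """Naively split '{{cite news |a=1 |b=2}}' into {'a': '1', 'b': '2'}."""
    inner = template.strip()[2:-2]          # drop the surrounding {{ and }}
    params = {}
    for piece in inner.split("|")[1:]:      # first piece is the template name
        if "=" in piece:
            key, value = piece.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def missing_fields(template: str) -> list:
    """List which of title, date and author are absent or empty."""
    params = parse_params(template)
    missing = [f for f in ("title", "date") if not params.get(f)]
    # Assumption: any of these parameters counts as an author being present.
    if not any(params.get(k) for k in ("author", "last", "last1")):
        missing.append("author")
    return missing


cite = "{{cite news |url=http://news.bbc.co.uk/1 |title=Foo |date=}}"
needs_fetch = bool(missing_fields(cite))    # True: date and author are missing
```

Only when the returned list is non-empty would the page's HTML source need to be fetched.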
960:
851:
Not all sites have rules for all fields (e.g. news.bbc.co.uk does not specify the article authors).
576:. The rules determine how to extract the template fields for each news site supported (e.g. for
541:
531:
848:
Where the derivation is a custom regular expression, the derivation starts with character '@'.
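The two notes on derivations (comma-separated alternatives, and a leading '@' marking a custom regular expression) could be interpreted as in the following Python sketch. The function name `parse_derivations` and the labels "named" and "regex" are hypothetical, and splitting on every comma is an assumption: a real regular expression containing a comma would need a more careful parser.

```python
def parse_derivations(value: str) -> list:
    """Split a settings value into (kind, text) pairs.

    kind is 'regex' for entries starting with '@', otherwise 'named'
    (for example a meta tag name such as DC.date.issued).
    """
    derivations = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue                      # ignore empty entries such as a trailing comma
        if item.startswith("@"):
            derivations.append(("regex", item[1:]))
        else:
            derivations.append(("named", item))
    return derivations


rules = parse_derivations('DC.date.issued,@<span class="date">(.*?)</span>')
```

Each pair can then be tried in order until one yields a value for the field.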
946:
250:
891:). Currently such authors are ignored; solutions for including them are under investigation.
176:
It has only been designed for use on the
English Knowledge; it may not work anywhere else.
8:
966:
970:
927:
845:
Where there are multiple derivations for the same field, these are separated by commas.
) is either "2011-01-15", "15 January 2011" or "January 15, 2011". The decision is:
376:
353:
220:
136:
62:
It operates only on sites that it has been specifically configured to work on; see the
939:
366:
240:
230:
202:
written in C#. In the future it may be made generally available as a Plugin for AWB.
82:
72:
45:
955:
Specialised for
Journal citations. Not an alternative to CiteCompletion as such.
873:
are supported. CiteCompletion will be improved over time to support more sites.
779:. This file is loaded into memory once per session. The format of the file is:
430:
All UPPERCASE or lowercase titles and author names are converted to Title Case.
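The tidy-up steps for a title (unescape HTML entities, straighten smart quotes, replace newlines with spaces, trim enclosing quotes, convert all-uppercase or all-lowercase text to Title Case) can be sketched in Python as follows. The function name `tidy_title` and the order of the steps are assumptions, and Python's `str.title()` is only a rough stand-in for whatever Title Case logic the script uses (it capitalises after apostrophes, for instance).

```python
import html

# Map curly quotes to their straight equivalents.
SMART_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def tidy_title(raw: str) -> str:
    text = html.unescape(raw)                 # &quot; &amp; etc. -> characters
    text = text.translate(SMART_QUOTES)       # smart quotes -> straight quotes
    text = " ".join(text.split())             # newlines/extra whitespace -> single spaces
    # Trim one pair of enclosing quotes; quotes inside the title are kept.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1].strip()
    # Only normalise case when the whole title is upper or lower case.
    if text.isupper() or text.islower():
        text = text.title()
    return text
```

For example, `tidy_title('&quot;POLICE ARREST\nMAN&quot;')` yields `Police Arrest Man`, while a mixed-case title is left as found.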
199:
41:
17:
914:
Identify and flag news articles where registration/paid access is required.
563:
An edit summary is generated with counts of how many fields were completed.
349:
Where the citation matches but it is not templated, it is converted to use
930:– a citation insertion script that supports all sites in a generic way.
550:
Otherwise count existing date usage in article and use the majority one.
186:
CiteCompletion is fully compatible with the
Harvard referencing system.
166:
It does not handle non-English news sites, nor sites not listed in the
150:
for those within citation templates, still only for those sites on the
329:
on the
Knowledge article is assessed for a URL matching one of the
333:. If a match is found a check is made to see if one or more of the
58:
It can complete the following fields in citation templates such as
Bare URLs with a bot-generated title, when within <ref> tags.
– a similar tool that scrapes popular sites with jQuery using
163:
It does not modify or update fields where they are already set.
942:– a citation completion script for Scientific Journal cites (
Enclosing quotes are trimmed from titles (quotes within the title are kept).
408:
Custom regex (matching a span, heading or script value etc.).
502:
is set from the XML settings if relevant (first checks that
For each supported site, a set of rules is available in an
132:
It will also tag dead links if not already tagged with
189:
Authors/titles with accented characters are supported.
417:
When a match is found the source match is tidied up:
Locations and job titles are removed from author names.
921:
439:Authors are split to "Lastname, Firstname" format.
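A minimal Python sketch of the author split is shown below. It follows the limitation stated elsewhere on this page that names with multiple first names or multiple surnames are currently ignored. The function name `split_author` and the dictionary return shape are hypothetical.

```python
from typing import Optional


def split_author(name: str) -> Optional[dict]:
    """Split 'Firstname Lastname' into |last= and |first= values.

    Returns None for anything other than exactly two name parts, because the
    boundary between first names and surnames cannot be determined reliably.
    """
    parts = name.split()
    if len(parts) != 2:
        return None
    first, last = parts
    return {"last": last, "first": first}


author = split_author("John Smith")
if author is not None:
    wikitext = f"|last={author['last']} |first={author['first']}"
```

Here `wikitext` becomes `|last=Smith |first=John`; a three-part name such as 'Name Anothername Surname' returns None and is skipped.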
271:CiteCompletion can complete the following fields:
963:– generates citations for The New York Times etc.
917:Allow community maintenance of XML settings file.
421:HTML-escaped characters are converted to Unicode.
895:
515:The date format used for inserted dates (both
427:Smart quotes are converted to straight quotes.
326:
157:
775:CiteCompletion uses an XML settings file of
266:
855:
210:
412:
Citation templates referencing a URL (e.g.
205:
488:is set from the XML settings if relevant.
391:The HTML source is then parsed using the
31:
904:Release CiteCompletion as an AWB plugin.
402:HTML script numbered property (s.prop).
259:Bare URLs when within <ref> tags.
14:
198:CiteCompletion is a Custom module for
44:and normally run under the account of
870:
386:
330:
167:
151:
52:
776:
392:
320:
315:
23:
590:
573:
433:Newlines are replaced with spaces.
24:
982:
567:
395:. Supported parsing methods are:
770:
180:
922:Alternative & related tools
766:Others will be added over time.
405:HTML div id/span class/p class.
193:
510:
13:
1:
877:
557:
896:Possible future improvements
7:
860:
495:is set to the current date.
10:
987:
961:Knowledge:WikiCite Builder
482:etc. for multiple authors.
158:What CiteCompletion is not
26:
680:seattletimes.nwsource.com
267:Supported template fields
144:|bot=RjwilmsiBot
856:Issues & limitations
781:
362:Where the citation uses
327:Supported citation types
211:Supported citation types
585:OriginalPublicationDate
413:Insert parameter values
372:it is converted to use
206:Detail of functionality
967:Ubiquity citation tool
399:HTML meta tag content.
32:What CiteCompletion is
934:templated citations).
911:field where relevant.
741:hollywoodreporter.com
521:|accessdate=
493:|accessdate=
305:|accessdate=
148:|deadurl=yes
122:|accessdate=
583:is stored under the
504:|publisher=
168:supported sites list
152:supported sites list
53:supported sites list
723:theglobeandmail.com
671:accessmylibrary.com
486:|location=
300:|location=
117:|location=
729:huffingtonpost.com
689:chicagotribune.com
622:washingtonpost.com
940:User:Citation bot
909:|agency=
836:</NewsSite>
833:</Encoding>
805:</Location>
796:TheDailyTelegraph
613:independent.co.uk
610:timesonline.co.uk
574:XML settings file
506:etc. is not set).
464:|author=
387:Parse HTML source
343:|author=
286:|author=
103:|author=
40:It is written by
978:
951:
945:
910:
890:
886:
885:|first=
837:
834:
830:
829:<Encoding>
827:
823:
820:
819:</Authors>
816:
813:
809:
806:
802:
801:<Location>
799:
795:
792:
788:
785:
784:<NewsSite>
674:post-gazette.com
659:findarticles.com
649:findarticles.com
582:
546:
540:
536:
530:
522:
518:
505:
501:
494:
487:
481:
480:|last2=
477:
476:|last1=
473:
472:|first=
469:
465:
460:is set as found.
459:
454:is set as found.
453:
452:|title=
381:
375:
371:
365:
358:
352:
344:
340:
336:
335:|title=
321:Assess citations
316:Processing logic
311:
306:
301:
295:
294:|first=
291:
287:
282:
277:
276:|title=
111:|first=
108:
104:
99:
94:
93:|title=
87:
81:
77:
71:
67:
61:
889:|last=
888:
884:
880:
871:Supported sites
863:
858:
839:
838:
835:
832:
828:
826:</Titles>
825:
821:
818:
815:<Authors>
814:
811:
807:
804:
800:
797:
793:
790:
789:telegraph.co.uk
786:
783:
773:
763:
762:
714:bizjournals.com
710:
692:dailymail.co.uk
652:
616:telegraph.co.uk
593:
591:Supported sites
581:|date=
580:
570:
560:
544:
538:
534:
528:
520:
517:|date=
516:
513:
503:
500:|work=
499:
492:
485:
479:
475:
471:
468:|last=
467:
463:
458:|date=
339:|date=
338:
334:
331:Supported sites
323:
318:
310:|work=
309:
304:
299:
293:
290:|last=
289:
285:
281:|date=
127:|work=
126:
121:
116:
110:
107:|last=
106:
102:
98:|date=
48:as a bot task.
822:<Titles>
812:</Dates>
810:DC.date.issued
782:
777:per-site rules
772:
769:
761:
760:
757:
754:
751:
748:
747:oregonlive.com
745:
742:
739:
738:irishtimes.com
736:
735:independent.ie
733:
732:nzherald.co.nz
730:
727:
724:
721:
720:denverpost.com
718:
715:
711:
709:
708:
705:
704:indiatimes.com
665:pqarchiver.com
607:guardian.co.uk
605:
602:
599:
598:news.bbc.co.uk
595:
594:
592:
589:
578:news.bbc.co.uk
569:
568:Per-site rules
393:per-site rules
808:<Dates>
798:</Work>
780:
778:
771:Settings file
768:
767:
758:
755:
752:
750:seattlepi.com
701:economist.com
662:theage.com.au
587:meta value).
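Looking up a field from a page's meta tags, as in the news.bbc.co.uk example where |date= comes from the OriginalPublicationDate meta value, might be sketched in Python with the standard-library HTML parser. The class name `MetaCollector`, the helper `meta_value`, and the sample HTML (including the format of its content value) are made up for illustration and are not taken from the script or from the BBC site.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaCollector(HTMLParser):
    """Collect name -> content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        content = attr.get("content")
        if name and content is not None:
            self.meta.setdefault(name, content)   # keep the first occurrence


def meta_value(html_source: str, name: str) -> Optional[str]:
    collector = MetaCollector()
    collector.feed(html_source)
    return collector.meta.get(name)


# Made-up sample page; the content format is illustrative only.
sample = ('<html><head>'
          '<meta name="OriginalPublicationDate" content="2011/01/15 10:30:00"/>'
          '</head><body></body></html>')
raw_date = meta_value(sample, "OriginalPublicationDate")
```

The raw value would still need the tidy-up described on this page (for example, stripping the timestamp) before being inserted as |date=.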
542:use mdy dates
533:
532:use dmy dates
466:is set using
181:Compatibility
18:User:Rjwilmsi
950:}}
947:cite journal
944:{{
899:
840:
794:<Work>
791:</URL>
774:
765:
764:
726:scotsman.com
698:thesun.co.uk
634:newsbank.com
628:usatoday.com
584:
577:
571:
545:}}
539:{{
535:}}
529:{{
514:
491:
446:
416:
390:
380:}}
374:{{
370:}}
364:{{
357:}}
351:{{
325:Each of the
324:
270:
254:}}
251:cite journal
248:{{
244:}}
238:{{
234:}}
228:{{
224:}}
218:{{
197:
194:Availability
140:}}
134:{{
131:
86:}}
80:{{
76:}}
70:{{
66:}}
60:{{
57:
50:
39:
35:
928:WP:REFLINKS
787:<URL>
695:cbsnews.com
686:foxnews.com
646:reuters.com
640:news.com.au
637:variety.com
631:latimes.com
619:thestar.com
601:nytimes.com
547:if present.
511:Date format
46:RjwilmsiBot
878:Infrequent
831:iso-8859-1
717:forbes.com
668:boston.com
656:sfgate.com
643:smh.com.au
558:Completion
146:, and set
869:Only the
759:pcmag.com
756:wired.com
707:hindu.com
377:cite news
354:cite news
221:cite news
137:dead link
63:cite news
971:ubiquity
907:Set the
861:Frequent
604:time.com
443:format).
367:cite web
241:citation
231:cite web
83:citation
73:cite web
42:Rjwilmsi
841:Notes:
683:wsj.com
625:cnn.com
527:Follow
288:(using
154:below.
105:(using
55:below.
37:rules.
27:Summary
817:author
803:London
753:ew.com
744:rte.ie
677:cbc.ca
170:below.
142:using
824:title
296:etc.)
113:etc.)
16:<
519:and
478:and
246:and
78:and
887:or
537:or
474:or
341:or
200:AWB
952:)
470:,
337:,
292:,
256:).
236:,
226:,
109:,
88::
68:,
382:.
359:.
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.