Improved Searches... and Duplicates!
by TheOldBill
In the last few days, the good people at TV.com finally fixed the search engine: hurrah! I am really chuffed! Unfortunately, I can now see the full extent of the duplicates problem. Finding duplicates is like shooting fish in a barrel now.

I had successfully been adding "do not use" tags to duplicates in the range up to 373963 after clearing them of credits, but for every duplicate created at TVTome there are three or four created here. For two show guides, almost every entry is a duplicate: the contributor/editor in question appears not to have known how to search for existing IDs.

Regarding which one to choose: there are no absolute rules regarding the order in which duplicates appear and, frustratingly, some actor IDs appear in a different order in a manual search than they do in the Search for Actor field on the Add Stars page. However, testing over the past few days suggests that, in general, the older, TVTome-created IDs appear last in any search. I believe this is because the system looks for best matches, and most TVTome IDs do not have separate entries in the First Name and Last Name fields.

afenla also points out in the forums that, after searching for an actor to add, you can right-click in the Add A New Guest Star section and view source. This shows the ID number for each listed option, allowing you to compare with the manual search results and find the correct ID.

By way of illustration, searching for "Garfield Morgan" now yields the following results:

Garfield Morgan
Garfield Morgan
Garfield Morgan - do not use
Garfield Morgan

The first, 396089, was entered with a blank space between names, so the system did not pick up on the match with 384725 and created a new ID. Because of the "space-as-middle-name" bug, it now has two blanks.
The second, 384725, is affected by the "space-as-middle-name" bug. It was created by the system because it did not match either of the legacy TVTome names, neither of which had First Name or Last Name attached.
The third, 201047, was created for show 1408. (I don't know why, but almost every cast entry for that show had a duplicate ID in the 200*** series at TVTome.) This used to be one of only two hits until the search engine was fixed, so I tagged it to stop people assigning credits to it after twice clearing it of credits.
The fourth, 16741, is the original and best.

I deal primarily with old British dramas, a relatively shallow gene pool of actors. As things stand, I now find that one in every three attempts to add a cast member yields a duplicate ID in the range 373967 up, generally from one of three shows. Each takes a while to investigate and correct and, frankly, if there are more than two possible hits, it is impossible to correct an entry in a show edited by someone else with any degree of confidence that you are adding the correct entry.

Beyond that, I continue to flag duplicate IDs with "do not use" tags. However, now that there are hundreds of incorrect person IDs listed at Add Stars, it is inevitable that duplicates will attract additional credits, even after they are cleared. There is an extensive list of duplicates in the forums: hopefully the staff will start clearing the duplicates one day!

Note:
TVTome IDs: up to 372591
TV.com IDs (double space): 372597-410906
TV.com IDs (correct format): 410909 up
TV.com IDs from 373967 up are now searchable
If in doubt, use the last listed ID in any search
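
For anyone who keeps their own list of duplicates, the ranges in the note and the "last listed ID" rule of thumb can be written down as a small Python sketch. This is only an illustration of the rules as I have stated them, not anything TV.com provides: the function names (classify_id, pick_last_listed) and the idea of holding search results as (ID, name) pairs are my own assumptions, and IDs that fall in the small gaps between the ranges are simply reported as "unknown".

```python
from typing import List, Optional, Tuple

# Boundaries copied from the note above. The gaps between the ranges
# are not explained there, so IDs inside a gap are reported as "unknown".
TVTOME_MAX = 372591
DOUBLE_SPACE_MIN, DOUBLE_SPACE_MAX = 372597, 410906
CORRECT_FORMAT_MIN = 410909


def classify_id(person_id: int) -> str:
    """Say which range from the note a person ID falls into."""
    if person_id <= 0:
        raise ValueError("person IDs are assumed to be positive")
    if person_id <= TVTOME_MAX:
        return "TVTome"
    if DOUBLE_SPACE_MIN <= person_id <= DOUBLE_SPACE_MAX:
        return "TV.com (double space)"
    if person_id >= CORRECT_FORMAT_MIN:
        return "TV.com (correct format)"
    return "unknown"


def pick_last_listed(results: List[Tuple[int, str]]) -> Optional[int]:
    """Apply the rule of thumb: use the last listed ID in a search.

    `results` is a hypothetical list of (person_id, displayed_name)
    pairs in the order the search shows them. Entries tagged
    "do not use" are skipped. Returns None if nothing usable is left.
    """
    for person_id, name in reversed(results):
        if "do not use" not in name.lower():
            return person_id
    return None


# The "Garfield Morgan" search from this post, in the order listed.
garfield = [
    (396089, "Garfield Morgan"),
    (384725, "Garfield Morgan"),
    (201047, "Garfield Morgan - do not use"),
    (16741, "Garfield Morgan"),
]

for person_id, _ in garfield:
    print(person_id, classify_id(person_id))
print("use:", pick_last_listed(garfield))
```

For the Garfield Morgan search this picks 16741, the original TVTome ID. The rule is only a guide: as noted above, the order of results is not guaranteed, so checking the ID numbers via view source is still the safer route.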