Help:Monitoring Data Quality

Revision as of 01:57, 16 May 2022

WeRelate allows you to check for possible errors in your data by visiting the Data Quality Issues page.

Description and definitions

  • Data quality issues are identified by a job that runs periodically. The Data Quality Issues page shows the results from the last time the job was run. The run date/time, shown in UTC (Greenwich Mean Time), is displayed at the top of the page. Because of the way the processing is done, the data may be a few hours older than the run date/time.
  • You cannot request a real-time issue check. If you just added or changed some data, you'll have to wait for the next run to check for issues.
  • Issues may be:
    • Anomalies - situations that are unusual enough to warrant review but might be correct, such as a person who married at age 6 or a person who was born before their parents were married
    • Errors - situations that are not correct, such as a person who married after they died, or a person who was born before a parent was born
    • Incomplete data - situations where minimal data about a person, such as gender, is missing
  • Note
    • Situations where sources are missing or incomplete might be added to this list in the future (or possibly a separate list)
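
To make the three categories concrete, here is a minimal sketch of how such checks could be expressed. It is not WeRelate's actual job: the Person record, its field names, and the MIN_PLAUSIBLE_MARRIAGE_AGE threshold are all assumptions made for illustration. Only the example situations (married at age 6, born before the parents married, married after death, born before a parent was born, missing gender) come from the definitions above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Person:
    # Hypothetical record; these field names are illustrative only.
    gender: Optional[str] = None
    birth: Optional[date] = None
    death: Optional[date] = None
    marriage: Optional[date] = None
    parents_marriage: Optional[date] = None
    parent_birth: Optional[date] = None


# Assumed threshold; the real job's cut-off is not documented here.
MIN_PLAUSIBLE_MARRIAGE_AGE = 12


def classify(p: Person) -> List[Tuple[str, str]]:
    """Return (category, description) pairs for one person."""
    issues = []

    # Incomplete data: minimal facts are missing.
    if p.gender is None:
        issues.append(("incomplete", "missing gender"))

    # Errors: situations that cannot be correct.
    if p.marriage is not None and p.death is not None and p.marriage > p.death:
        issues.append(("error", "married after death"))
    if p.birth is not None and p.parent_birth is not None and p.birth < p.parent_birth:
        issues.append(("error", "born before a parent was born"))

    # Anomalies: unusual, but possibly correct, so they need review.
    if p.birth is not None and p.marriage is not None:
        before_birthday = (p.marriage.month, p.marriage.day) < (p.birth.month, p.birth.day)
        age = p.marriage.year - p.birth.year - int(before_birthday)
        if 0 <= age < MIN_PLAUSIBLE_MARRIAGE_AGE:
            issues.append(("anomaly", f"married at age {age}"))
    if p.birth is not None and p.parents_marriage is not None and p.birth < p.parents_marriage:
        issues.append(("anomaly", "born before parents married"))

    return issues


child_bride = Person(gender="F", birth=date(1800, 1, 1), marriage=date(1806, 6, 1))
print(classify(child_bride))  # [('anomaly', 'married at age 6')]
```

The distinction that matters is that an error can never be correct, while an anomaly can be, which is why only anomalies can be marked as verified.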

Interacting with the list

Click the links on the list to see and correct issues. In addition to that, you can:

  • Mark an anomaly as verified by clicking the "Verified by me" button. This means that you have reviewed the situation and determined that the data is correct - for example, that a person was truly born before their biological parents married, and was not from a previous marriage/relationship of one of the parents.
    • Before marking an issue as verified, ensure that the page (or a related page, such as the family page) has the sources that prove the information to be correct.
    • When you select the "Verified by me" button, a template is added to the Talk page of the indicated Person or Family page (the Talk page will be automatically created if it doesn't already exist). This template identifies you and the date you clicked the "Verified by me" button. Others will see this information when they open the Talk page.
  • Defer an issue by clicking the "Defer" button. This allows you to track issues that you are not prepared to address yet, or that you may never address.
    • For example:
      • Maybe you are working on your own project but choose to clean up a few issues each day, and are looking for "low-hanging fruit" such as simple date typos. You might want to defer larger problems, such as a page that conflates two individuals, until you are prepared to devote the time required for the necessary research.
      • Maybe you need to ask a family member for the correct data and are waiting for a reply.
      • Maybe you don't have the necessary expertise or access to sources to resolve the issue.
    • When you select the "Defer" button, a template is added to the Talk page of the indicated Person or Family page (the Talk page will be automatically created if it doesn't already exist). You will have an opportunity to add a comment (e.g., "conflated persons", "waiting for a reply"). The template identifies you and the date you clicked the "Defer" button, and includes the comment. Others will see this information when they open the Talk page.

Filtering the list

When you first open the Data Quality Issues page, the list reflects the entire database, except for anomalies that have been verified.

You can filter the list by category (anomalies, errors, incomplete data) and/or choose to include verified anomalies.

  • If you choose to include verified anomalies, the list will indicate who verified each anomaly. An anomaly can be verified by more than one user - in fact, a second set of eyes can increase the reliability of the data, since everyone makes mistakes at some point.

If you are signed in, you can also restrict the list as follows:

  • If you've created one or more MyTrees, use the MyTrees dropdown to select a MyTree
OR
  • Select "Watched only" or "Unwatched only", depending on your interest

Filtering on a MyTree or on your watchlist can cause performance issues. For that reason, when you use either of these filters, the system limits the number of issues displayed at a time. The limit is applied automatically, and you will be informed of it.
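
The way these filters combine can be sketched as follows. This is an illustration only, not the site's implementation: the Issue record, the filter_issues function, its parameter names, and the default limit of 100 are all hypothetical. The behaviour it models is taken from the description above: verified anomalies are hidden unless requested, the MyTree and watchlist filters are alternatives, and a display limit applies only when one of those two is used.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class Issue:
    # Hypothetical record; field names are illustrative only.
    page: str
    category: str  # "anomaly", "error" or "incomplete"
    verified_by: List[str] = field(default_factory=list)
    trees: Set[str] = field(default_factory=set)
    watched: bool = False


def filter_issues(
    issues: Iterable[Issue],
    categories: Optional[Set[str]] = None,
    include_verified: bool = False,
    tree: Optional[str] = None,
    watched: Optional[bool] = None,
    limit: int = 100,  # assumed value; the real limit is reported by the page
) -> List[Issue]:
    # The MyTree and watchlist restrictions are presented as alternatives.
    if tree is not None and watched is not None:
        raise ValueError("choose a MyTree or a watchlist filter, not both")

    result = []
    for issue in issues:
        if categories is not None and issue.category not in categories:
            continue
        # Verified anomalies are hidden by default.
        if issue.category == "anomaly" and issue.verified_by and not include_verified:
            continue
        if tree is not None and tree not in issue.trees:
            continue
        if watched is not None and issue.watched != watched:
            continue
        result.append(issue)

    # The display limit applies only to the MyTree and watchlist filters.
    if tree is not None or watched is not None:
        result = result[:limit]
    return result
```

For example, filter_issues(issues, categories={"error"}, watched=True) would correspond to choosing the errors category together with "Watched only".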

List order

The list is in alphabetical order: Person pages sorted by last name and then first name, followed by Family pages sorted by page title. The order of Family pages may change in the future.
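
The ordering described above can be expressed as a single sort key. The sketch below is illustrative only: the dictionary layout, the sort_key and order_pages names, and the case-insensitive comparison are assumptions, and the site's actual collation rules may differ.

```python
from typing import Dict, List, Tuple


def sort_key(page: Dict[str, str]) -> Tuple[int, str, str]:
    # Person pages come first, ordered by last name and then first name.
    if page["namespace"] == "Person":
        return (0, page["last_name"].lower(), page["first_name"].lower())
    # Family pages follow, ordered by page title.
    return (1, page["title"].lower(), "")


def order_pages(pages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    return sorted(pages, key=sort_key)


pages = [
    {"namespace": "Family", "title": "Smith and Jones"},
    {"namespace": "Person", "last_name": "Smith", "first_name": "John"},
    {"namespace": "Person", "last_name": "Adams", "first_name": "Zoe"},
    {"namespace": "Family", "title": "Adams and Brown"},
]
for page in order_pages(pages):
    print(page)
```

Here the two Person pages (Adams, then Smith) are listed before the two Family pages (Adams and Brown, then Smith and Jones).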