Results 1 to 6 of 6
  1. #1
    Member
    Join Date
    Jun 2008
    Location
    Abu Dhabi, UAE
    Posts
    69

    Data Quality Measures

    Hi,

    I have been working to verify the quality of a huge amount of data that a client is giving me. The records are for a bank and sit in an Excel sheet that is exported from an Oracle database.
    Using Excel functions, I have sorted the data, which runs into millions of records, by account number and the respective account details.

    But he wants me to develop a QA process that can always assure the quality of the data so that they can use it for their respective purposes whenever they need.

    If anyone has worked on this before, or is working on it now, kindly guide me. If a QA process for data quality assurance already exists, please point me to it.

    Thanks,
    Adeel
    Keep Sharing, Keep Growing

  2. #2
    Moderator
    Join Date
    May 2000
    Location
    USA
    Posts
    13,170

    Re: Data Quality Measures

    [ QUOTE ]
    But he wants me to develop a QA process that can always assure the quality of the data so that they can use it for their respective purposes whenever they need.

    [/ QUOTE ]

    What does he mean by "quality of the data" in this case?
    Joe Strazzere
    Visit my website: AllThingsQuality.com to learn more about quality, testing, and QA!

  3. #3
    Moderator
    Join Date
    Mar 2004
    Location
    West Coast of the East Coast!
    Posts
    7,756

    Re: Data Quality Measures

    Sounds like a job for some sort of automation! You wouldn't need much of a tool just for that.
    Personal Comment

    Success is the ability to go from one failure to another with no loss of enthusiasm.
    ~ Winston Churchill ~


    ...Rich Wagner

  4. #4
    Moderator
    Join Date
    Sep 2001
    Location
    Yankee Land
    Posts
    4,055

    Re: Data Quality Measures

    Data Quality Testing is hardly new; there is a fair bit out there if you take the time to look.
    - M

    Nothing learns better than experience.

    "So as I struggle with this issue I am confronted with the reality that noting is perfect."
    - Unknown

    Now wasting blog space at QAForums Blogs - The Lookout

  5. #5
    Moderator
    Join Date
    May 2000
    Location
    USA
    Posts
    13,170

    Re: Data Quality Measures

    "Data Quality" means different things to different companies.

    My company has a division in India gathering our core data for us. We have a huge infrastructure involving people, processes, and tools that helps ensure the quality of the data.

    For example, we have a process where two different people gather the same data and put it into the database. If they have captured different values for some reason, that is noted in a queue. A different, more senior group of folks needs to go in and determine which (if either) of the values is correct.
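
    A minimal sketch of that double-entry comparison, assuming each person's entries are available as a dictionary keyed by record id. The function name, the field names, and the sample values are hypothetical and only illustrate the idea; they do not describe any particular company's tooling.

```python
def find_discrepancies(entries_a, entries_b):
    """Compare two independently captured datasets and build a review queue.

    entries_a / entries_b map a record id to a dict of field values
    (a hypothetical shape chosen for illustration).
    Each queue item is (record_id, field, value_a, value_b); field is None
    when a whole record is missing from one of the two passes.
    """
    queue = []
    for record_id in sorted(set(entries_a) | set(entries_b)):
        a = entries_a.get(record_id)
        b = entries_b.get(record_id)
        if a is None or b is None:
            # Only one of the two people captured this record.
            queue.append((record_id, None, a, b))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                queue.append((record_id, field, a.get(field), b.get(field)))
    return queue


# Made-up sample data: the two passes disagree on one balance.
first_pass = {"1001": {"name": "A. Khan", "balance": 2500}}
second_pass = {"1001": {"name": "A. Khan", "balance": 2050}}
review_queue = find_discrepancies(first_pass, second_pass)
```

    The queue only records that the two values differ; deciding which one (if either) is right stays with the reviewers.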

    Then the data is passed on to another team, which performs statistical sampling of the data and re-checks the values. Rejects need to go back for further analysis.
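
    One way the sampling step could look, as a rough sketch: draw a simple random sample of record ids for manual re-checking. The 1% default, the function name, and the fixed seed are assumptions made for illustration; a real sampling plan would be sized by whoever owns the statistics.

```python
import random


def draw_audit_sample(record_ids, fraction=0.01, seed=None):
    """Return a simple random sample of record ids to re-check by hand.

    fraction is the share of records to sample (hypothetical default of 1%).
    Passing a seed makes the draw reproducible for later audits.
    """
    ids = list(record_ids)
    if not ids:
        return []
    rng = random.Random(seed)
    sample_size = max(1, round(len(ids) * fraction))
    return rng.sample(ids, sample_size)


# Example: pick 5% of 1,000 made-up record ids.
sample = draw_audit_sample(range(1000), fraction=0.05, seed=42)
```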

    Then there are still other levels of checking on top of that, and feedback mechanisms where the data can be challenged by customers.

    For my company, this is what we would mean by "Data Quality". But it's a business decision, based on the importance of having the data be correct from the beginning, and the risk of incorrect data. For other companies, the content is owned by others and assumed to be correct. For them, "Data Quality" might mean just making sure that data isn't lost or transformed incorrectly as it moves from one system to another (such as from a computer to a spreadsheet). For others, "Data Quality" might mean ensuring that clients can't enter wildly bad data into a website.
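
    For the "isn't lost or transformed incorrectly" case, a reconciliation check between the source and the export is one possible approach. The sketch below assumes both sides can be read as rows of already-normalized values with a unique key in the first column; the function names and sample rows are hypothetical.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's values so two copies of the same row can be compared."""
    joined = "\x1f".join(str(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source_rows, exported_rows, key_index=0):
    """Report keys that were lost, added, or changed between two copies.

    Assumes the key column is unique and that values are formatted the
    same way on both sides (for example, "100.50" and not 100.5).
    """
    source = {row[key_index]: row_fingerprint(row) for row in source_rows}
    exported = {row[key_index]: row_fingerprint(row) for row in exported_rows}
    return {
        "missing": sorted(set(source) - set(exported)),
        "unexpected": sorted(set(exported) - set(source)),
        "changed": sorted(
            key for key in set(source) & set(exported)
            if source[key] != exported[key]
        ),
    }


# Made-up rows: one record lost, one altered, one that should not exist.
source_rows = [("A1", "Ali", "100.50"), ("A2", "Sara", "20.00"), ("A3", "Omar", "7.25")]
exported_rows = [("A1", "Ali", "100.50"), ("A2", "Sara", "2.00"), ("A4", "Zed", "1.00")]
report = reconcile(source_rows, exported_rows)
```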
    Joe Strazzere

  6. #6
    Member
    Join Date
    Oct 2003
    Location
    New Zealand
    Posts
    97

    Re: Data Quality Measures

    This is a really interesting problem. Data quality means many things, including the reliability of the collection method and possible errors in migration and translation. When you have such a large amount of data that appears to be pseudo-random, it becomes a mathematical problem. The only thing that comes to my mind is statistical analysis, but you will have to find someone who can assist you. Banks usually have statistical analysts, so it would be useful to talk to one.

    Years ago, I had to help a friend analyse rainfall data for quality. Though it was only several thousand sets of data, the situation was similar. You get a spreadsheet with time and rainfall. The first step is looking for outliers. For example, if you find 10 metres of rainfall in one day, but you never had to climb onto the roof to avoid a flood, the value is likely to be 10 cm (or mm, depending on the units used) rather than 10 m. Then we compared the values with other available data, such as flow rates in rivers and known annual average rainfall.
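
    A plausibility check of that kind can be sketched in a few lines. The bounds below are assumptions picked for illustration (daily rainfall in millimetres, nothing negative, nothing above 2,000 mm), and the readings are made up; sensible limits depend on the data and the units.

```python
def flag_implausible(readings, lower, upper):
    """Return (index, value) pairs for readings outside [lower, upper].

    Missing readings (None) are flagged too, so they get looked at.
    """
    return [
        (index, value)
        for index, value in enumerate(readings)
        if value is None or not (lower <= value <= upper)
    ]


# Made-up daily rainfall in mm; 10000.0 looks like a unit mix-up (m vs mm).
daily_rainfall_mm = [0.0, 12.5, 10000.0, 3.2, -1.0]
suspects = flag_implausible(daily_rainfall_mm, lower=0.0, upper=2000.0)
```

    Flagged values are candidates for review, not automatic corrections; whether 10000.0 should have been 10 mm or 10 cm still needs a second data source.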

    In your case, you can start by analyzing, say, the top 1% and bottom 1% of values for outliers. For example, if you are looking at age (if it is one of the data columns), this will help you find 200-year-old people and babies who actively work with the bank :-). Then some profiling would help, by interrelating the data you have. This is where statistical analysis comes into play. A very simple example: say that, according to industry data, the 30-40 age group makes up 50% of account holders who have more than a certain amount of money in the bank. If you find it is only 20% in your sample, you have to check why. I am no statistics expert, so I can't describe this in more detail, but I'm sure you will be able to find someone at the bank who can.
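
    Both ideas can be sketched roughly as follows: pull the lowest and highest 1% of a column for inspection, then compare an observed share against an expected one. The function names, the 5-point tolerance, and the sample numbers are all assumptions for illustration; a statistician would likely replace the fixed tolerance with a proper significance test.

```python
def tails(values, fraction=0.01):
    """Return the lowest and highest `fraction` of values for inspection."""
    ordered = sorted(values)
    count = max(1, int(len(ordered) * fraction))
    return ordered[:count], ordered[-count:]


def share_differs(observed_count, sample_size, expected_share, tolerance=0.05):
    """Check whether an observed share strays from the expected share.

    tolerance is a hypothetical cut-off, not a statistical test.
    Returns (differs, observed_share).
    """
    observed_share = observed_count / sample_size
    return abs(observed_share - expected_share) > tolerance, observed_share


# Made-up ages: mostly 35, plus a few values that deserve a second look.
ages = [35] * 196 + [0, 1, 200, 212]
lowest, highest = tails(ages, fraction=0.01)

# Industry data says 50%, but only 20 of 100 sampled holders match.
differs, observed = share_differs(20, 100, expected_share=0.50)
```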

    Hope this helps,

    Krish

 

 
