Hi, I would like to share with you an article from Reuters. I know that this is not a purely scientific paper, but I think it covers a really important topic from a data scientist's point of view, and it shows that even such giants in IT and AI as Amazon can make mistaken assumptions in some AI applications.
Link to the article:
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
We are standing on the edge of tomorrow, handing more and more tasks to AI in the hope that it will be faster, cheaper, and will bring us better results, but could this sometimes be a trap?
AI is designed to learn from data, and sometimes this means that it replicates the bad habits of the people who produced that data. Of course, biased observations need to reach a critical mass before they bias the whole model, but in some applications this can happen, and we as data scientists should be careful and test our models extensively.
Moreover, using AI usually means that we freeze the current state of a process. This may be fine for mature processes, but where more flexibility is required, an AI solution can take that flexibility away, and we may end up doing things the way we did in the past, without any improvements.
Questions:
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
2. Do you think bias in AI could be a really big issue in future?
3. Can you find similar applications where models are not impartial?
4. Do you have some idea how to predict that our model will have an issue like this?
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
It's hard for me to say with 100% certainty where the truth lies; it may make sense, but I cannot judge from just one article. I do not know whether the source it comes from is trustworthy, or whether it is just the jealousy of someone who fights against the corporation.
2. Do you think bias in AI could be a really big issue in future?
The number of biased AI systems and algorithms will increase. Bad data can contain implicit racial, gender, or ideological biases. Many AI systems will continue to be trained using bad data, making this an ongoing problem. Identifying and mitigating bias in AI systems is essential to building trust between humans and machines that learn. As AI systems find, understand, and point out human inconsistencies in decision making, they could also reveal ways in which we are partial, parochial, and cognitively biased, leading us to adopt more impartial or egalitarian views. In the process of recognizing our bias and teaching machines about our common values, we may improve more than AI.
3. Can you find similar applications where models are not impartial?
No, I do not know of any.
4. Do you have some idea how to predict that our model will have an issue like this?
Unfortunately, I have no idea.
Bad data can have many meanings. I remember one of my projects where the customer told us not to rely on his historical data, because it had been produced by biased or sometimes lazy people, and he wanted us to create an AI system that would be fair. It was really hard to explain that without reliable historical data it is impossible to create a good AI algorithm.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
I haven't the foggiest idea. Maybe the head hunters had some influence on it. Can you really say anything with 100% certainty after reading a single article? I would be far from drawing definitive conclusions on that basis alone.
2. Do you think bias in AI could be a really big issue in future?
AI can be problematic in itself. Today we are delighted with AI because we are striving to develop it. What this will lead to in the future is really hard to say.
Perhaps, however, the predictions of Hollywood SF film creators will prove right and AI will lead to annihilation, the destruction of our species.
3. Can you find similar applications where models are not impartial?
I have not seen anything like that, at least so far.
4. Do you have some idea how to predict that our model will have an issue like this?
Unfortunately not.
I believe these are not only the predictions of SF movie creators, but also of some great minds like Hawking and visionaries like Musk. But maybe it is inevitable, because everyone is warning us, yet no one seems to have a way to stop it.
This is a contentious issue. I don't accept the victimhood cult that underlies the claims of this article. Imagine for a moment that there was an AI "biased against men" (for most people this even sounds ridiculous - because men inherently are somehow exempt from discrimination?). Would we be sounding the alarm like in this case, or would everybody just shrug and say "no problem, this can help even out the wage gap"?
1. Do you think it is a matter of bias in data produced by head hunters or rather a matter of skewness in data as they were previously hiring more men than women?
Yes, it is a product of the input data. No, it's not malice on the part of the recruiters. The real error is using historical data about a group to evaluate an individual member of that group. We should go back to evaluating every candidate individually, and not give scores based on their race, sex, or sexual preferences.
2. Do you think bias in AI could be a really big issue in future?
I always say that decision makers' stupidity is the issue, not a specific technology like AI.
3. Can you find similar applications where models are not impartial?
Yes, an even more famous one is the "future crime prediction" case. It was supposed to predict whether a second offence, or a probation violation, would occur. It turned out it had false positives for black criminals and false negatives for white ones. I don't have the link to the story, but there was also a lot of fuss about 'bias'. Well, in machine learning we have the bias vs. variance trade-off, but the real problem is that we can't always predict the future based on what happened before.
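To make such a claim measurable, one simple check is to compute error rates separately for each group. Below is a minimal sketch in Python; the data, group labels, and column names are invented for illustration and are not taken from the actual COMPAS case:

```python
import pandas as pd

# Invented toy data: one row per person, with the model's prediction,
# the actual outcome, and a group label.
df = pd.DataFrame({
    "group":     ["A", "A", "A", "A", "B", "B", "B", "B"],
    "predicted": [1, 1, 0, 1, 0, 0, 1, 0],   # model says "will reoffend"
    "actual":    [0, 1, 0, 0, 1, 0, 1, 1],   # what actually happened
})

def error_rates(sub):
    fp = ((sub["predicted"] == 1) & (sub["actual"] == 0)).sum()
    fn = ((sub["predicted"] == 0) & (sub["actual"] == 1)).sum()
    neg = (sub["actual"] == 0).sum()
    pos = (sub["actual"] == 1).sum()
    return pd.Series({
        "false_positive_rate": fp / neg if neg else float("nan"),
        "false_negative_rate": fn / pos if pos else float("nan"),
    })

# A large gap between groups is the kind of disparity described in the reporting:
# here group A gets mostly false positives, group B mostly false negatives.
print(df.groupby("group").apply(error_rates))
```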
4. Do you have some idea how to predict that our model will have an issue like this?
Yeah, let's build another AI to measure bias.
Great example with 'future crime prediction'. This could be an example of data where a longer history does not mean a better predictor of future values. Generally, AI has a problem predicting a rapidly changing environment.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
It is a very interesting article; I did not realize that this happens! I think that the situation mentioned results from the fact that most of the data came from men. Women are in the minority, so the AI tools were trained on patterns observed in the CVs submitted to the company, where men were in the majority.
2. Do you think bias in AI could be a really big issue in future?
Personally, I do not know, but I suppose that bias in AI could be a challenge for our generation and future ones. Artificial intelligence has large potential, but it can also pose various threats. AI algorithms are based on data, so if we have, for example, unbalanced sets or mistakes in our data, we can obtain false results. AI systems are only as good as the data we put into them. Bad data used to train AI can contain implicit racial, gender, or ideological biases. AI is a very fast-growing technology, so there is hope that the problems of bias in AI will be solved, especially where such decisions can directly harm a person's life or liberty.
3. Can you find similar applications where models are not impartial?
Yes, I found an example of an application where the models are not impartial. The COMPAS program uses machine learning and historical data to predict the probability that a violent criminal will reoffend. It incorrectly predicts that black people are more likely to reoffend than they actually are. Here is a link: https://medium.com/thoughts-and-reflections/racial-bias-and-gender-bias-examples-in-ai-systems-7211e4c166a1
4. Do you have some idea how to predict that our model will have an issue like this?
Unfortunately, for now I do not have any idea how to predict that our model will have an issue like the one you mentioned in your presentation. I think that finding one requires time and tests, but I suppose that obtaining balanced sets of data is important as well.
Yes, I also personally think that it is a matter of the overrepresentation of men in this data, but the problem is deeper and shows that AI solutions can stop change and become guardians of the status quo, no matter whether that state is fair or not and whether a new state would be better or not. So we need to retain some basic flexibility after we release an AI solution.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
I think this is obvious, because such a recommendation system learns from previous data. Recommendation systems are a subclass of information filtering systems whose purpose is to automatically or semi-automatically remove unnecessary or unwanted data from the information stream. Basically, there are two main types of recommendation systems. The first one, called Content-Based Filtering, takes into account the customer's purchase history and the product description, type, category, etc. The second approach, Collaborative Filtering, recommends to users products that are rated or bought by users with similar interests.
2. Do you think bias in AI could be a really big issue in future?
The main problem is scalability. In systems where both users and catalogue items are counted in millions, standard recommendation algorithms become unusable, because recommendations cannot be produced in a short period of time. Another problem is sparsity: with large numbers of products (counted in millions), the user-item matrix used in the collaborative filtering approach will be very sparse, and the quality of recommendations is put to the test. There is also the problem of the so-called cold start. It occurs when a new user appears or an item is added to the catalogue. New items cannot be recommended until some users buy and rate them, and new users cannot receive good-quality recommendations because they have made too few ratings or have an empty purchase history.
3. Can you find similar applications where models are not impartial?
Algorithms belonging to the memory-based category use the entire user rating database to generate recommendations. These systems use statistical measures, such as Pearson correlation, to find a group of users called neighbours who are similar to the active user (they have rated the same items similarly, or tend to buy similar sets of items). Once a group of neighbours has been found, a recommendation is calculated on the basis of the items they have rated. Techniques known as nearest-neighbour or memory-based Collaborative Filtering are popular and widely used in practice.
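As a small illustration of this neighbour-based idea, here is a toy sketch with a made-up rating matrix (not code from any production recommender); the ratings and the prediction function are invented for the example:

```python
import numpy as np

# Made-up user-item rating matrix: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5, 3, 0, 1],
    [4, 2, 4, 1],
    [1, 5, 2, 5],
    [2, 4, 5, 4],
], dtype=float)

def pearson(u, v):
    """Pearson correlation over the items both users have rated."""
    mask = (u > 0) & (v > 0)
    if mask.sum() < 2:
        return 0.0
    a, b = u[mask], v[mask]
    if a.std() == 0 or b.std() == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1])

def predict(R, user, item, k=2):
    """Estimate a rating as a similarity-weighted average over the k most similar neighbours."""
    sims = [(pearson(R[user], R[other]), other)
            for other in range(len(R))
            if other != user and R[other, item] > 0]
    top = [(s, o) for s, o in sorted(sims, reverse=True) if s > 0][:k]
    num = sum(s * R[o, item] for s, o in top)
    den = sum(s for s, _ in top)
    return num / den if den else 0.0

# User 0 has not rated item 2; the most similar neighbour (user 1) rated it 4,
# so the prediction comes out close to 4.0.
print(predict(R, user=0, item=2))
```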
4. Do you have some idea how to predict that our model will have an issue like this?
The main functions of recommendation systems are the analysis of user data and the extraction of useful information to use for recommendations. This user data may include search history in a search engine, shopping history in an online store or auction system, the most frequently viewed items in an online store, and even information about what users 'like' on social networks.
Thank you for your great description of recommendation systems.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
I think both of these things could have had an impact. But, as others mentioned, it is difficult to evaluate the theses presented in the article on its basis alone, without knowing the subject more precisely.
2. Do you think bias in AI could be a really big issue in future?
Definitely. Humanity produces more and more data and the challenge will be to process it, and proper preparation of data is crucial in artificial intelligence algorithms.
3. Can you find similar applications where models are not impartial?
I haven't found such cases yet.
4. Do you have some idea how to predict that our model will have an issue like this?
I have no idea, but perhaps some advanced mathematical statistics would work here.
Yes, some mathematical cross-testing of the population could work in this case, but the problem is what happens if at some point another change shows up. To predict such a situation you would need a more complex AI solution that understands not only hiring, but also some aspects of the environment in which the company exists.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
The Amazon system was based on the resumes submitted to the company over a 10-year period. So, based on the historical data, the result was a mirror of the headhunters' preferences. The problem is not that their new recruiting engine did not like women, but that people from HR were biased towards one gender that whole time.
2. Do you think bias in AI could be a really big issue in future?
The problem of bias goes back as far as the term “big data,” and even before that it was recognised in the old saying “garbage in, garbage out,” so I think it will be with us for a while :)
3. Can you find similar applications where models are not impartial?
Nothing as spectacular comes to my mind; the Amazon case is the most famous one that I was aware of.
4. Do you have some idea how to predict that our model will have an issue like this?
Yes! We have the "right to an explanation" of our models, so we should double-check how they actually work; see the sketch after the list below.
2016:
- Local explanations LIME
- Why should I trust you?
- Local model approximation
2017:
- SHAP (python)
2018:
- On the robustness of interpretability methods
- DALEX
- modelDown
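As a rough sketch of how such a double-check could look with one of the tools listed above (SHAP), assuming a scikit-learn model; the features, data, and the proxy column name are invented for the example and have nothing to do with Amazon's actual system:

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Invented candidate data: a few features and a hire / no-hire label.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "years_experience": rng.integers(0, 15, 500),
    "num_projects":     rng.integers(0, 20, 500),
    "club_membership":  rng.integers(0, 2, 500),   # stand-in for a possible proxy feature
})
y = (X["years_experience"] + rng.normal(0, 2, 500) > 7).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# SHAP attributes each individual prediction to the input features. If a feature
# that should be irrelevant (e.g. a proxy for a protected attribute) gets large
# attributions, that is exactly the kind of issue this double-check should surface.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)   # one dot per candidate, features sorted by impact
```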
Thank you for your comment, but was this really a problem of biased HR people, or a problem of the universities, which 'produce' 90% male IT engineers and only 10% women?
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
I think the problem is in the historical data. In the graphics in the article we see that more men were hired than women. The AI saw that and decided that the company needs male workers. We need to correct the skewness in the data for better AI results.
2. Do you think bias in AI could be a really big issue in future?
I don't think so. We need well-prepared learning data for AI that will satisfy the requirements. Mistakes in the data could generate false results.
3. Can you find similar applications where models are not impartial?
I can’t. I don't remember any application where models are not impartial.
4. Do you have some idea how to predict that our model will have an issue like this?
Maybe we should write tests to verify the results that AI will produce.
I think the scary part is that we can't simply call it 'bad data', as this was a common pattern in the IT industry. Maybe we just can't always predict the future based on history?
Hi Tomasz, thank you for bringing this hot and controversial topic for discussion.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
There might be many reasons for this bias: the disproportion between women and men in the training set definitely impacts the results, and the past trend of hiring mostly men could also have influenced it. But the reason why an AI-based hiring algorithm chooses mostly men can be slightly different. Maybe the people responsible for creating the tool were not objective in designing the algorithm's logic? Maybe they emphasised male attributes more. I think that the hiring process should be gender-blind.
2. Do you think bias in AI could be a really big issue in future?
The balance between bias and variance is an inseparable part of any predictive algorithm. But taking into account that we want to rely on AI more and more, in almost every aspect of our lives, this becomes very important.
3. Can you find similar applications where models are not impartial?
There was Microsoft's AI bot. Less than a day after release, Tay.ai was taken down for becoming a sexist, racist monster. She was supposed to be a normal teenage girl, but she turned into a Hitler-loving, feminist-bashing troll.
4. Do you have some idea how to predict that our model will have an issue like this?
A perfect model does not exist, so there will always be some issue to deal with; the bias vs. variance trade-off will not disappear.
I believe the algorithm itself was fine and they intentionally removed information about sex from the data, but the algorithm learned from each word in a resume, so if it found that a candidate had graduated from a women's college or was involved in some women's interest clubs, it still had a way to be biased against those candidates.
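One way to check for exactly this effect is to probe whether gender can be predicted back from the remaining resume text; if it can, proxy terms like a women's college are still present. A minimal sketch with invented toy resumes (not Amazon's data or pipeline):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Invented toy resumes and held-out gender labels (1 = female, 0 = male).
resumes = [
    "captain of the women's chess club, java developer",
    "java developer, hiking club",
    "graduated from a women's college, python and sql",
    "python and sql, marathon runner",
] * 50
gender = np.array([1, 0, 1, 0] * 50)

vec = CountVectorizer()
X = vec.fit_transform(resumes)

# If a simple probe predicts gender well from text alone, the words act as proxies
# and can reintroduce the bias even after the explicit gender field is removed.
probe = LogisticRegression(max_iter=1000)
print("gender predictable from text, accuracy:",
      cross_val_score(probe, X, gender, cv=5).mean())

# The terms with the largest absolute weights show where the proxy signal lives.
probe.fit(X, gender)
vocab = vec.get_feature_names_out()
top = np.argsort(np.abs(probe.coef_[0]))[::-1][:5]
print("strongest proxy terms:", [vocab[i] for i in top])
```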
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
In my opinion, the bias was a matter of who prepared, gathered, and scraped the data for model training purposes, and how.
2. Do you think bias in AI could be a really big issue in future?
Yes, this will be a real issue, the same as bias in any relevant discipline. Judgment in any field should be based on facts and arguments, not bias and feelings.
3. Can you find similar applications where models are not impartial?
To be honest, at this moment I cannot think of any similar applications.
4. Do you have some idea how to predict that our model will have an issue like this?
This is rather a problem with the data used for model training, not the model topology itself.
I agree that it was a problem of the data rather than the topology of the model, but the data wasn't preselected; they used the whole base of employees and candidates. So how can we examine the data to find such issues?
1. It definitely was a matter of skewness in the data. Still, the question is why the data was skewed and more men than women were hired. Not correcting for that bias was a mistake made by the people who created the model.
2. If it reflects a phenomenon that is naturally biased, it shouldn't be a problem.
3. Probably all situations where there was an underrepresented class of data and it wasn't handled properly could fit here. For example, a face detection solution that didn't work for African Americans.
4. As already mentioned, it's a problem with the data. One idea for solving such problems is to increase the diversity of people working in AI. People from various environments may be sensitive to different biases and detect problems with data more easily.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
Probably both, as it is hard to prepare a clean dataset for such a system.
2. Do you think bias in AI could be a really big issue in future?
It might be, as more decisions will be based on AI. Some of those AIs will be trained on biased data, some - on data that can be interpreted as biased. And in the end - there is this "I can't do anything, it's the computer that said 'no'" mentality that lets people reduce their perceived personal responsibility.
3. Can you find similar application where models are not impartial?
I've read about some racial biases in US credit scoring systems (FICO), and some creative ways to counter them in "Equality of Opportunity in Supervised Learning" by Moritz Hardt, Eric Price, Nathan Srebro (https://arxiv.org/abs/1610.02413). But I'm sure that even if a lack of bias in the model could be mathematically proven, it would still be possible to gather a social media mob screaming "bias!"
4. Do you have some idea how to predict that our model will have an issue like this?
I think it is certain that in literally every model some bias can be found, it's just a matter of choosing the scoring of attributes appropriately.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
In my opinion, none of the above. Nobody was at fault here. That's just the reality, and it all depends on the studied population. If we checked employment among cleaners, women would certainly be in the lead. If we study those employed in technical roles, then most of them are men.
2. Do you think bias in AI could be a really big issue in future?
I don't think so. Certainly there will be areas in which even people are biased. Since AI is based on data that usually comes from human life and human choices, it is clear that this problem will also be replicated by AI.
3. Can you find similar application where models are not impartial?
Only one example comes to my mind: making decisions in critical, life-or-death situations. You will find more information in this article: https://www.bbc.com/news/magazine-41504285
4. Do you have some idea how to predict that our model will have an issue like this?
First of all, we should test the created AI thoroughly. Then, algorithms detecting bias or other ethical problems should be designed and implemented. In many cases, subjective judgments or reliance on partial data cannot be avoided.
1. It's hard to tell. It seems to me that it is more a matter of Amazon previously hiring more men. In the end, working in such a place is quite demanding in physical terms, so men have better predispositions to perform it. Artificial intelligence, which learned from data based on former employees, could be expected to make this type of error.
2. Artificial intelligence is always taught on a set of data. If we extend the data set appropriately and make sure that only evenly distributed data ends up in it, then we will not have a problem with the prejudices of the artificial intelligence. If Amazon had taken other criteria into account in the selection of future employees, it would not have encountered this problem.
3. Unfortunately, I do not know of such applications.
4. I think that we should consider a much wider range of data than we usually anticipate. We should also look at the criteria we follow when we want to get our result. If the data is not diverse and we choose the learning criteria badly, then we can get a biased AI. I also think that the more objective tests, the better; then we can find out whether our artificial intelligence is going the way we would like.
1. Do you think it is a matter of bias in data produced by head hunters or rather a matter of skewness in data as they were previously hiring more men than women?
In my opinion it's a byproduct of the input data. I don't think it's the recruiters' malice or fault. And I also agree with the previous speakers that if it had been a case of discrimination against men, no one would have made so much fuss about it, if any at all.
2. Do you think bias in AI could be a really big issue in future?
It might be, but as with all problems, someone will eventually come up with a decent solution to this issue.
3. Can you find similar applications where models are not impartial?
Unfortunately, at the moment nothing comes to my mind.
4. Do you have some idea how to predict that our model will have an issue like this?
To be completely honest, not really.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
It is difficult for me to answer this question; unfortunately, I am not competent in this area. Of course, the first step is to determine the authenticity of the source data.
2. Do you think bias in AI could be a really big issue in future?
I think not. AI is based on a data set; if the data is true, then the prediction of the AI will be true, bitter but true.
3. Can you find similar applications where models are not impartial?
Unfortunately not
4. Do you have some idea how to predict that our model will have an issue like this?
I think not, but I cannot be sure of it.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
I don't know, because they do not provide enough details. They write that they found the bias and give examples, but we don't know how they found the bias or what its reason is. They said that they removed this particular gender bias, but it is not certain that the bias was removed completely. Even so, I think that bias exists even when humans choose the candidates. I think that it is impossible to remove it completely. If there are many candidates and the company can hire only a few of them, then some good candidates may not get the job.
2. Do you think bias in AI could be a really big issue in future?
I think yes, and not only in the case of job applications. The reason is that researchers and engineers do not spend enough time and effort on preparing data. In some cases it is difficult and requires the domain knowledge of an expert. Sometimes it is just not the most exciting step of the research. I think that more attention should be paid to the process of preparing data. We should even hire specialists for this, so that the AI specialists can focus on their job while other specialists prepare the data.
3. Can you find similar applications where models are not impartial?
I think that there is a risk of models that are not impartial if we have unbalanced data. I can't think of any other application right now.
4. Do you have some idea how to predict that our model will have an issue like this?
I think that, first of all, we should carefully prepare the training data and algorithms. We should know what our data contains and how exactly the algorithms work.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
It's hard to say. Probably after a thorough examination of the data you could find records that had some effect on the results, but that's just guessing.
2. Do you think bias in AI could be a really big issue in future?
It's the first time I have come across this question, and honestly I do not know what to answer. AI does exactly what it learns from the training set, so the problem of "prejudice" depends, as things stand now, on the creator of the training set. Its content can be adjusted both to the benefit and to the disadvantage of either sex.
3. Can you find similar applications where models are not impartial?
Probably every model based on a badly chosen training set, though I don't know of any specific ones.
4. Do you have some idea how to predict that our model will have an issue like this?
Review training sets for the diversity of the input data. No other ideas.
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
Unfortunately, I don't know exactly how the mentioned algorithm worked, so I cannot give you a straight answer to this question, and guessing just doesn't feel right to me ;) but maybe a wrong dataset was the problem.
2. Do you think bias in AI could be a really big issue in future?
It seems to me that the problem may be related to getting new data. For example, you can train an AI for car driving, but what will happen when these cars go to Poland, where almost all drivers do not comply with the rules? We cannot predict whether the AI will adapt to the new conditions or just use solutions which are not appropriate in the new environment.
3. Can you find similar application where models are not impartial?
Personally, I think shop recommendations don't work properly.
As the joke goes: once I have bought a drill, the shop recommends drills to me for the next six months. But why would I need a second drill? ;)
4. Do you have some idea how to predict that our model will have an issue like this?
I must admit that I don't know why the algorithm worked the way it did. Maybe you could hide data about the workers' gender and see what happens?
1. Do you think it is a matter of bias in the data produced by head hunters, or rather a matter of skewness in the data, as they previously hired more men than women?
The data in the chart showing the division of employees by gender indicated that in the majority of cases they were men. Maybe the training data used was wrong and caused the bias.
2. Do you think bias in AI could be a really big issue in future?
In my opinion, the selection of relevant data is of key importance to the results obtained, and this applies not only to AI but to any other system. Before use, data should be appropriately selected and prepared.
3. Can you find similar applications where models are not impartial?
Unfortunately, no application comes to my mind at this moment.
4. Do you have some idea how to predict that our model will have an issue like this?
I would try to review the data and become familiar with it. However, I am aware that with a large amount of data this may not be possible. Alternatively, I would try to build an application to support data selection.
Hello Tomasz,
Many thanks for the inspiring article.
Ad. 1. For me it is rather the first option, i.e. disinformation intentionally produced by people earning money on it. In my opinion, such issues don't appear just like that.
Ad. 2. Definitely yes. Everybody can observe in how many branches of our life AI has grown in importance over the last several years. We use it even in such sensitive areas as medicine. So bias, possible everywhere AI can be applied, can mean a disaster that is difficult to predict and impossible to avoid, I am afraid. The only solution to this problem that I can see is in people's minds, so that people stop using this technology in the wrong way. I mean that society should start working on the ethical side of this problem. Otherwise, we're going to live in an artificial, untrue world.
Ad. 3. I am sorry, but I don't know of such an example; it is really difficult to find one in this respect. And judging by the answers above, there are fewer and fewer such examples to be found, unfortunately.
Ad. 4. I am sorry again, but this seems difficult to imagine. I tried to do it, but I failed. If I think something up, I'll let you know :-)
BR,
Marta