Data mining in education

19 Jul

Marc Parry has written a fascinating Chronicle of Higher Education article, Big Data on Campus, which was also published in the New York Times:

With 72,000 students, A.S.U. is both the country’s largest public university and a hotbed of data-driven experiments. One core effort is a degree-monitoring system that keeps tabs on how students are doing in their majors. Stray off-course and a student may have to switch fields.

And while not exactly matchmaking, Arizona State takes an interest in students’ social lives, too. Its Facebook app mines profiles to suggest friends. One classmate shares eight things in common with Ms. Allisone, who “likes” education, photography and tattoos. Researchers are even trying to figure out social ties based on anonymized data culled from swipes of ID cards around the Tempe campus.

This is college life, quantified.

Data mining hinges on one reality about life on the Web: what you do there leaves behind a trail of digital breadcrumbs. Companies scoop those up to tailor services, like the matchmaking of eHarmony or the book recommendations of Amazon. Now colleges, eager to get students out the door more efficiently, are awakening to the opportunities of so-called Big Data.

The new breed of software can predict how well students will do before they even set foot in the classroom. It recommends courses, Netflix-style, based on students’ academic records.

Data diggers hope to improve an education system in which professors often fly blind. That’s a particular problem in introductory-level courses, says Carol A. Twigg, president of the National Center for Academic Transformation. “The typical class, the professor rattles on in front of the class,” she says. “They give a midterm exam. Half the kids fail. Half the kids drop out. And they have no idea what’s going on with their students.”

As more of this technology comes online, it raises new tensions. What role does a professor play when an algorithm recommends the next lesson? If colleges can predict failure, should they steer students away from challenges? When paths are so tailored, do campuses cease to be places of exploration?

For a really good explanation of “data mining” go to Alexander Furnas’ Atlantic article.

In Everything You Wanted to Know About Data Mining but Were Afraid to Ask, Furnas writes:

To most of us data mining goes something like this: tons of data is collected, then quant wizards work their arcane magic, and then they know all of this amazing stuff. But, how? And what types of things can they know? Here is the truth: despite the fact that the specific technical functioning of data mining algorithms is quite complex — they are a black box unless you are a professional statistician or computer scientist — the uses and capabilities of these approaches are, in fact, quite comprehensible and intuitive.

For the most part, data mining tells us about very large and complex data sets, the kinds of information that would be readily apparent about small and simple things. For example, it can tell us that “one of these things is not like the other” a la Sesame Street or it can show us categories and then sort things into pre-determined categories. But what’s simple with 5 datapoints is not so simple with 5 billion datapoints….

Discovering information from data takes two major forms: description and prediction. At the scale we are talking about, it is hard to know what the data shows. Data mining is used to simplify and summarize the data in a manner that we can understand, and then allow us to infer things about specific cases based on the patterns we have observed. Of course, specific applications of data mining methods are limited by the data and computing power available, and are tailored for specific needs and goals. However, there are several main types of pattern detection that are commonly used. These general forms illustrate what data mining can do.
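To make Furnas' two forms concrete, here is a minimal Python sketch. It is an illustration only, not anything taken from the Atlantic article or from the software the colleges use. The study-hour numbers, the category centers, and the function names `find_outliers` and `classify` are all hypothetical. The first function is a simple version of "one of these things is not like the other" (description), and the second sorts a case into pre-determined categories (prediction).

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Description: return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every value is identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


def classify(value, centers):
    """Prediction: assign a value to the pre-determined category with the nearest center."""
    return min(centers, key=lambda name: abs(value - centers[name]))


# Hypothetical data: weekly study hours reported by eight students.
hours = [10, 11, 9, 10, 12, 11, 10, 40]

# Hypothetical pre-determined categories, each described by a typical value.
centers = {"low": 5, "typical": 10, "high": 20}

print(find_outliers(hours))   # the one student who is "not like the others"
print(classify(13, centers))  # category predicted for a new student
```

With eight numbers a person can spot the odd one out at a glance. The same question asked of millions of records is where software like this becomes necessary.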

With the ability to collect vast amounts of information come questions about privacy, including how privacy should be defined.

Ljiljana Brankovic and Vladimir Estivill-Castro discuss privacy issues in Privacy Issues in Knowledge Discovery:

Recent developments in information technology have enabled collection and processing of vast amounts of personal data, such as criminal records, shopping habits, credit and medical history, and driving records. This information is undoubtedly very useful in many areas, including medical research, law enforcement and national security. However, there is an increasing public concern about the individuals’ privacy. Privacy is commonly seen as the right of individuals to control information about themselves. The appearance of technology for Knowledge Discovery and Data Mining (KDDM) has revitalized concern about the following general privacy issues:

· secondary use of the personal information,

· handling misinformation, and

· granulated access to personal information.

They demonstrate that existing privacy laws and policies are well behind the developments in technology, and no longer offer adequate protection.

We also discuss new privacy threats posed by KDDM, which includes massive data collection, data warehouses, statistical analysis and deductive learning techniques. KDDM uses vast amounts of data to generate hypotheses and discover general patterns. KDDM poses the following new challenges to privacy.

· stereotypes,

· guarding personal data from KDDM researchers,

· individuals from training sets, and

· combination of patterns.

In Who has access to student records? Moi said:

There is a complex intertwining of laws which often prevent school officials from disclosing much about students.

According to Fact Sheet 29: Privacy in Education: Guide for Parents and Adult-Age Students, Revised September 2010, the major laws governing disclosure of student records are:

What are the major federal laws that govern the privacy of education records?

  • Family Educational Rights and Privacy Act (FERPA) 20 USC 1232g (1974)

  • Protection of Pupil’s Rights Amendments (PPRA) 20 USC 1232h (1978)

  • No Child Left Behind Act of 2001, Pub. L. 107-110, 115 STAT. 1425 (January 2002)

  • USA Patriot Act, P.L. 107-56 (October 26, 2001)

  • Privacy Act of 1974, 5 USC Part I, Ch. 5, Subch. II, Sec. 552a

  • Campus Sex Crimes Prevention Act (Pub. L. 106-386)

FERPA is the best known and most influential of the laws governing student privacy. Oversight and enforcement of FERPA rests with the U.S. Department of Education. FERPA has recently undergone some changes since the enactment of the No Child Left Behind Act and the USA Patriot Act….

See, The Federal Educational Rights and Privacy Act balancing act

Still, schools collect a lot of information about students.

Jason Koebler has written an interesting U.S. News article, Who Should Have Access to Student Records?

Since “No Child Left Behind” was passed 10 years ago, states have been required to ramp up the amount of data they collect about individual students, teachers, and schools. Personal information, including test scores, economic status, grades, and even disciplinary problems and student pregnancies, are tracked and stored in a kind of virtual “permanent record” for each student.

But parents and students have very little access to that data, according to a report released Wednesday by the Data Quality Campaign, an organization that advocates for expanded data use.

All 50 states and Washington, D.C., collect long-term, individualized data on students’ performance, but just eight states allow parents to access their child’s permanent record. Forty allow principals to access the data and 28 provide student-level info to teachers.

Education experts, including Secretary of Education Arne Duncan and former Washington, D.C., Schools Chancellor Michelle Rhee, argue that education officials can use student data to assess teachers—if many students’ test scores are jumping in a specific teacher’s class, odds are that teacher is doing a good job.

Likewise, teachers can use the data to see where a student may have struggled in the past and can tailor instruction to suit his needs.

At an event discussing the Data Quality Campaign report Wednesday, Rhee said students also used the information to try to out-achieve each other.

“The data can be an absolute game changer,” she says. “If you have the data, and you can invest and engage children and their families in this data, it can change a culture quickly.”

Privacy experts say the problem is that states collect far more information than parents expect, and it can be shared with more than just a student’s teacher or principal.

“When you have a system that’s secret [from parents] and you can put whatever you want into it, you can have things going in that’ll be very damaging,” says Lillie Coney, associate director of the Electronic Privacy Information Center. “When you put something into digital form, you can’t control where that’ll end up.”

According to a 2009 report by the Fordham University Center on Law and Information Policy, some states store students’ Social Security numbers, family financial information, and student pregnancy data. Nearly half of states track students’ mental health issues, illnesses, and jail sentences.

Without access to their child’s data, parents have no way of knowing what teachers and others are learning about them.

The U.S. Department of Education enforces FERPA.


Data Mining: How Companies Now Know Everything About You

What is Data Mining? – YouTube

Defining Privacy for Data Mining                                                            

Dr. Wilda says this about that ©
