# Math 640: EXPERIMENTAL MATHEMATICS Class Project

http://sites.math.rutgers.edu/~zeilberg/EM19/ClassProject.html

Last Update: April 28, 2019.

The Class Group project coordinator is Yukun Yao.

The project proceeds in the following stages.

# Step I: Data collection

• Yukun Yao will email you a list of 4 to 6 Rutgers Math faculty members.

Go to the Tenure-Tenure Track math faculty directory and, for each name, find the person's rank, encoded as follows:

• Assistant Professor: 0
• Associate Professor: 1
• Professor: 2
• Distinguished Professor (including Special professors Iwaniec and Lebowitz): 3

For each of your assigned names, start a list of the form

[FirstNameLastName, Rank, ...]

For example: [DoronZeilberger, 3]

• Then go to MathSciNet (google "Rutgers MathSciNet", then log in with your NetID and password), search for each name in the Author field, and click on it. Record the number of publications and the number of citations. To get the h-index, click on "Citations" and determine it manually: go down the list of citations, starting at the first one, until you reach the largest i such that the i-th most cited paper has at least i citations. For example, for "Doron Zeilberger" the h-index is 23.
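
The manual h-index rule above can also be checked mechanically once the per-paper citation counts have been copied down. The following is a minimal sketch, not part of the assignment; the function name `h_index` and the toy counts are hypothetical and are not MathSciNet data.

```python
def h_index(citation_counts):
    """Return the largest i such that the i-th most cited paper has >= i citations."""
    # Sort from most cited to least cited, as in the manual procedure.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for i, cites in enumerate(ranked, start=1):
        if cites >= i:
            h = i      # the i-th most cited paper still has at least i citations
        else:
            break      # counts only decrease from here, so no larger i can work
    return h


# Hypothetical citation counts, for illustration only (NOT real data).
toy_counts = [10, 8, 5, 4, 3, 0]
print(h_index(toy_counts))
```

An author with no papers, or with no cited papers, gets an h-index of 0 under this rule.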

• Then google "Rutgers Salaries" and determine the BASE SALARY of each of your people. (It may be convenient to first pull up SAS-Mathematics and then look within it, since searching for an individual takes a long time.)

Combine the data into a list

[FirstNameLastName,Rank, NumberPublications, NumberCitations, Hindex, BaseSalary]

For example:

[DoronZeilberger, 3 , 211, 2432, 23, 217337]

• Combine all these lists into one list with the names in alphabetical order.
Then Yukun will catenate these lists into an ABT, ready for data science.
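
As a rough illustration of the record format and the alphabetical merge described in Step I, here is a small sketch. The helper names `validate` and `combine` are hypothetical, "alphabetical order" is assumed to mean ordering by the FirstNameLastName string as written, and the second record is a made-up placeholder, not a real faculty member.

```python
def validate(record):
    """Check that a record looks like [Name, Rank, Publications, Citations, Hindex, BaseSalary]."""
    if len(record) != 6:
        raise ValueError("expected 6 fields, got %d" % len(record))
    name, rank = record[0], record[1]
    if not isinstance(name, str) or not name:
        raise ValueError("the first field must be a non-empty name string")
    if rank not in (0, 1, 2, 3):
        raise ValueError("rank must be 0, 1, 2 or 3")
    if any(value < 0 for value in record[2:]):
        raise ValueError("counts and salary must be non-negative")
    return record


def combine(records):
    """Validate every record and return them sorted by name (case-insensitive)."""
    return sorted((validate(r) for r in records), key=lambda r: r[0].lower())


records = [
    ["DoronZeilberger", 3, 211, 2432, 23, 217337],  # the example record from this page
    ["AliceExample", 0, 10, 50, 4, 100000],         # hypothetical placeholder record
]
combined = combine(records)
print(combined)
```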

Then Yukun will assign you one of several data analysis tasks, to be specified soon.

# Step II: Data Analysis, Some suggestions

• Use multi-variable logistic regression to predict the probability of having rank at least 3, at least 2, and at least 1, based on the number of publications and the number of citations and possibly (if not too hard) the h-index.
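
One possible reading of the logistic regression task is to fit three separate binary models, one for each threshold "rank at least k". The sketch below does this with plain gradient descent in pure Python. Everything here is an assumption made for illustration: the function names, the learning rate, the small L2 penalty, and the toy rows, of which only the last one (211 publications, 2432 citations, rank 3) comes from this page.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


def fit_logistic(X, y, lr=0.1, steps=2000, l2=0.01):
    """Batch gradient descent on the logistic loss with a small L2 penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# [publications, citations, rank]. All rows except the last are hypothetical.
toy = [(5, 20, 0), (12, 60, 0), (25, 150, 1), (40, 300, 1),
       (60, 600, 2), (90, 1000, 2), (150, 1800, 3), (211, 2432, 3)]
features = [[p, c] for p, c, _ in toy]
ranks = [r for _, _, r in toy]
X, means, stds = standardize(features)

# One binary model per threshold k: the target is 1 when rank >= k.
models = {k: fit_logistic(X, [1 if r >= k else 0 for r in ranks]) for k in (1, 2, 3)}


def prob_rank_at_least(k, publications, citations):
    x = [(v - m) / s for v, m, s in zip((publications, citations), means, stds)]
    w, b = models[k]
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


for k in (1, 2, 3):
    print(k, prob_rank_at_least(k, 100, 1200))
```

Adding the h-index would only mean appending a third column to `features`. Fitting the thresholds independently is a simplification; nothing forces the three fitted probabilities to be ordered consistently.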

• Do simple (least-squares) regression to predict salary from (i) the number of publications, (ii) the number of citations.
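
For a single predictor, the least-squares line has a closed form: the slope is the covariance of x and y divided by the variance of x. A minimal sketch follows, with a hypothetical function name and a synthetic check on an exact line rather than real salary data.

```python
def simple_least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic check: points lying exactly on y = 2x + 1.
slope, intercept = simple_least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

For the project, `xs` would be the publication counts (or citation counts) and `ys` the base salaries.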

• Do multi-variable regression to predict salary from the vector [number of publications,number of citations, H-index]
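
With several predictors, one standard approach is to solve the normal equations for the coefficient vector. The sketch below does so with a small Gaussian elimination in pure Python. The function names are hypothetical and the check uses synthetic numbers, not project data.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multilinear(X, y):
    """Return [intercept, c1, ..., cd] minimizing the sum of squared residuals."""
    Z = [[1.0] + [float(v) for v in row] for row in X]  # prepend the intercept column
    d = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(d)] for i in range(d)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(d)]
    return solve_linear_system(ZtZ, Zty)


# Synthetic check: y = 1 + 2a + 3b - c holds exactly for every row.
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [2, 1, 3]]
y = [3, 4, 0, 6, 5]
coeffs = fit_multilinear(X, y)
print(coeffs)
```

On the real data the three predictors are likely to be strongly correlated and very differently scaled, so the normal equations may be poorly conditioned. A library least-squares routine such as `numpy.linalg.lstsq` would probably be more robust than this hand-rolled solver.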

• Do a statistical analysis: mean, variance, percentiles (with histograms), and the covariance matrix of the fields

[NumberOfPublications,NumberOfCitations,Hindex,Salary]
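
The summary statistics listed above could be computed along the following lines. This is a sketch using Python's standard `statistics` module. The helper names, the choice of quartiles as the percentiles, the use of the sample (n-1) convention, and the equal-width bin counts standing in for a histogram are all assumptions.

```python
import statistics


def summarize(values):
    """Mean, sample variance, and quartile cut points of one field (needs >= 2 values)."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),        # sample variance, divides by n - 1
        "quartiles": statistics.quantiles(values, n=4),  # three cut points
    }


def covariance_matrix(columns):
    """Sample covariance matrix; columns[i] holds all the values of field i."""
    n = len(columns[0])
    means = [statistics.mean(c) for c in columns]

    def cov(i, j):
        return sum((a - means[i]) * (b - means[j])
                   for a, b in zip(columns[i], columns[j])) / (n - 1)

    k = len(columns)
    return [[cov(i, j) for j in range(k)] for i in range(k)]


def histogram(values, bins=5):
    """Counts of values in `bins` equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # avoid a zero width when all values are equal
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return counts


# Synthetic numbers, for illustration only.
print(summarize([1, 2, 3, 4, 5]))
print(covariance_matrix([[1, 2, 3], [2, 4, 6]]))
print(histogram([1, 2, 3, 4, 5], bins=2))
```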

• [Challenge] Construct a neural net of depth 2 (and, if possible, 3) that uses "softmax" to predict the probability of each rank from the vector

[NumberOfPublications, NumberOfCitations]
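
As a starting point for the challenge, here is a small sketch of a network with one tanh hidden layer followed by a softmax output layer over the four ranks, trained by per-sample gradient descent on the cross-entropy loss. It assumes that "depth 2" means one hidden layer plus the output layer and that the two inputs have already been standardized. The layer size, learning rate, function names, and toy inputs are all hypothetical.

```python
import math
import random


def softmax(z):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_net(n_in, n_hidden, n_out, seed=0):
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W1": mat(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": mat(n_out, n_hidden), "b2": [0.0] * n_out}


def forward(net, x):
    """Return (hidden activations, class probabilities) for one input vector."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    z = [sum(w * hi for w, hi in zip(row, h)) + b
         for row, b in zip(net["W2"], net["b2"])]
    return h, softmax(z)


def train_step(net, x, label, lr):
    """One gradient step on the cross-entropy loss for a single example."""
    h, p = forward(net, x)
    dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    # Backpropagate to the hidden layer before W2 is modified.
    dh = [sum(net["W2"][k][j] * dz[k] for k in range(len(dz))) for j in range(len(h))]
    dpre = [dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]  # tanh'(a) = 1 - tanh(a)^2
    for k in range(len(dz)):
        for j in range(len(h)):
            net["W2"][k][j] -= lr * dz[k] * h[j]
        net["b2"][k] -= lr * dz[k]
    for j in range(len(h)):
        for i in range(len(x)):
            net["W1"][j][i] -= lr * dpre[j] * x[i]
        net["b1"][j] -= lr * dpre[j]


def mean_loss(net, X, y):
    return -sum(math.log(max(forward(net, x)[1][label], 1e-12))
                for x, label in zip(X, y)) / len(X)


# Hypothetical, already-standardized [publications, citations] pairs with ranks 0..3.
X = [[-1.2, -1.0], [-1.0, -0.9], [-0.5, -0.6], [-0.3, -0.4],
     [0.2, 0.1], [0.4, 0.5], [1.0, 1.1], [1.4, 1.2]]
y = [0, 0, 1, 1, 2, 2, 3, 3]

net = init_net(n_in=2, n_hidden=6, n_out=4)
loss_before = mean_loss(net, X, y)
for _ in range(300):
    for x, label in zip(X, y):
        train_step(net, x, label, lr=0.1)
loss_after = mean_loss(net, X, y)
print(loss_before, loss_after)
print(forward(net, [1.4, 1.2])[1])  # probabilities for ranks 0, 1, 2, 3
```

A depth-3 variant would insert a second hidden layer between `W1` and `W2` and extend the backward pass by one more step of the same pattern.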