#Experimental Math Spring 2019 Rutgers University
#Victoria Chayes
#5/12/2019
#Contribution to Class Project

Help:=proc():
print(`NLTVPD(DATA,n,CWMParam,FParam,PDHGParam,full): Does non-local total variation clustering using the primal dual hybrid gradient algorithm to sort data.`):
print(`Inputs:`):
print(`DATA: the list of data to be sorted, given as a list of lists.`):
print(`n: the number of clusters for the data to be sorted into.`):
print(`CWMParam= a list of parameters [e,c] for the calculation of the weight matrix (a measure of how close each datapoint is to the other datapoints). These parameters influence the overall final clustering, ie changing these parameters changes the output.`):
print(`---e: a positive number determining the weight given to the standard Euclidean distance as a measure of the distance between the vectors. Set e=1 and c=0 to construct a weight matrix solely using Euclidean distance.`):
print(`---c: a positive number determining the weight given to the cosine similarity as a measure of the distance between the vectors. Set c=1 and e=0 to construct a weight matrix solely using cosine similarity.`):
print(`FParam= a list of parameters [e,c,l] for the calculation of the fidelity term of the energy functional. These parameters influence the overall final clustering, ie changing these parameters changes the output.`):
print(`---e,c: the same weights for Euclidean vs cosine distance as above. It is recommended that the same e and c be chosen in FParam and CWMParam.`):
print(`---l: a positive number determining the weight in the energy functional given to the fidelity term.`):
print(`PDHGParam= a list of parameters [theta,sigma,tau,tol,innerloop,outerloop]. These parameters influence the rate of convergence, but not the final clustering, ie changing these parameters changes how long the program runs but not the output.`):
print(`---theta, sigma, tau: three strictly positive numbers satisfying 0<theta<=1 and sigma*tau<=1; sigma and tau are divided internally by the largest entry of the weight matrix, which gives the usual PDHG step-size condition.`):
print(`---tol: a nonnegative number; the outer loop stops once the change between successive clusterings drops below tol.`):
print(`---innerloop: a positive integer, the number of iterations of the inner primal dual loop.`):
print(`---outerloop: a positive integer, the number of outer iterations when a fixed iteration count is used. In the source code, comment out 'while d>tol do' and un-comment 'for z from 1 to outerloop do' to have the stopping criterion be a set number of iterations instead of for changes in the assignment function to be less than a certain tolerance.`):
print(`full: set to 1 to receive the datapoints directly sorted into n clusters as an output. set to 0 (or any other value besides 1) to receive the final Uhard clustering assignment matrix as the output.`):
print(`Outputs:`):
print(`The datapoints in a list of n lists of lists, with each of the n lists of lists indicating a sorted cluster of data, OR the final Uhard clustering assignment matrix as a list of lists.`):
print(`Try:`):
print(`NLTVPD(ExtractFields(MR,[3,4,5,6,7,8,9]),4,[1,0],[1,0,10^6],[1,2.5,.01,0,10,25],1)`):
print(`Call HelpAux() to see all Auxiliary functions.`):
end:

HelpAux:=proc():
print(`AverageVector(M), Comp(n,k), ConstructWeightMatrix(DATA,e,c), ConvertHard(i,n), CountInstances(temp,k), DumbThreshhold(U), Fidelity(g,C,e,c), FindFlatIndex(temp), GreaterThanEq(temp,a), NewCentroids(SDATA), NoDoubleCounting(temp), NormalizeData(DATA), Norms(DATA), NonlocalDivergence(W,v), NonlocalGradient(W,U), PDInner(u,f), PermuteList(L,P), ProjectEdge(M), RandomInitialCentroids(DATA,n), SetAToB(M,a,b), SimplexGrid(n,s), SortData(DATA,Uhard), Threshhold(U,s,eta,y)`):
end:

#################################################################################
#################################################################################
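#The weight matrix combines two notions of distance between datapoints, as described in the Help
#text above. The following is a minimal illustrative sketch (not one of the original auxiliary
#functions) of the weighted combination of Euclidean distance and cosine dissimilarity that the
#parameters e and c control; it assumes u and v are nonzero vectors given as lists.
HybridDistanceSketch:=proc(u,v,e,c) local n,dotp,normu,normv,i:
n:=nops(u):
dotp:=add(u[i]*v[i],i=1..n):
normu:=sqrt(add(u[i]^2,i=1..n)):
normv:=sqrt(add(v[i]^2,i=1..n)):
#e weights the Euclidean distance, c weights the cosine dissimilarity 1-cos(u,v);
#e=1, c=0 gives pure Euclidean distance, e=0, c=1 gives pure cosine dissimilarity
e*sqrt(add((u[i]-v[i])^2,i=1..n))+c*(1-dotp/(normu*normv)):
end: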
#NLTVPD(DATA,n,CWMParam,FParam,PDHGParam,full): Does non-local total variation clustering using the primal dual hybrid gradient algorithm to sort data.
#Inputs:
#DATA: the list of data to be sorted, given as a list of lists.
#n: the number of clusters for the data to be sorted into.
#CWMParam= a list of parameters [e,c] for the calculation of the weight matrix (a measure of how close each datapoint is to the other datapoints). These parameters influence the overall final clustering, ie changing these parameters changes the output.
# e: a positive number determining the weight given to the standard Euclidean distance as a measure of the distance between the vectors. Set e=1 and c=0 to construct a weight matrix solely using Euclidean distance.
# c: a positive number determining the weight given to the cosine similarity as a measure of the distance between the vectors. Set c=1 and e=0 to construct a weight matrix solely using cosine similarity.
#FParam= a list of parameters [e,c,l] for the calculation of the fidelity term of the energy functional. These parameters influence the overall final clustering, ie changing these parameters changes the output.
# e: a positive number determining the weight given to the standard Euclidean distance as a measure of the distance between the vectors. Set e=1 and c=0 to construct the fidelity term solely using Euclidean distance. It is recommended that the same e be chosen in FParam and CWMParam.
# c: a positive number determining the weight given to the cosine similarity as a measure of the distance between the vectors. Set c=1 and e=0 to construct the fidelity term solely using cosine similarity. It is recommended that the same c be chosen in FParam and CWMParam.
# l: a positive number determining the weight in the energy functional given to the fidelity term.
#PDHGParam= a list of parameters [theta,sigma,tau,tol,innerloop,outerloop]. These parameters influence the rate of convergence, but not the final clustering, ie changing these parameters changes how long the program runs but not the output.
# theta, sigma, tau: three strictly positive numbers satisfying 0<theta<=1 and sigma*tau<=1; sigma and tau are divided internally by the largest entry of the weight matrix, which gives the usual PDHG step-size condition.
# tol: a nonnegative number; the outer loop stops once the change between successive clusterings drops below tol.
# innerloop: a positive integer, the number of iterations of the inner primal dual loop.
# outerloop: a positive integer, the number of outer iterations when a fixed iteration count is used. Comment out `while d>tol do` and un-comment `for z from 1 to outerloop do` to have the stopping criterion be a set number of iterations instead of for changes in the assignment function to be less than a certain tolerance.
#full: set to 1 to receive the datapoints directly sorted into n clusters as an output. set to 0 (or any other value besides 1) to receive the final Uhard clustering assignment matrix as the output.
#Outputs:
#The datapoints in a list of n lists of lists, with each of the n lists of lists indicating a sorted cluster of data, OR the final Uhard clustering assignment matrix as a list of lists.
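#A worked check of the step-size parameters (assuming the standard PDHG requirement that
#sigma*tau*L^2<=1, where L here is estimated by the largest entry of the weight matrix):
#with the suggested PDHGParam=[1,2.5,.01,0,10,25], the internal steps are sigma=2.5/opnorm and
#tau=.01/opnorm, so sigma*tau*opnorm^2=2.5*.01=.025<=1, and theta=1 lies in (0,1], so the
#suggested parameters are admissible.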
NLTVPD:=proc(DATA,n,CWMParam,FParam,PDHGParam,full)
local m1,m2,W,C,theta,opnorm,sigma,tau,tol,d,innerloop,U,Ubar,P,F,i,j,k,innercount,Uhard,hope,rec,temp,tempF,SDATA,COld,Uold,z,outerloop:
m1:=nops(DATA): m2:=nops(DATA[1]):
C:=RandomInitialCentroids(DATA,n):
W:=NormalizeData(ConstructWeightMatrix(DATA,CWMParam[1],CWMParam[2])):
theta:=PDHGParam[1]:
opnorm:=max(W):
sigma:=PDHGParam[2]/opnorm:
tau:=PDHGParam[3]/opnorm:
tol:=PDHGParam[4]:
d:=2*tol+1:
innerloop:=PDHGParam[5]:
outerloop:=PDHGParam[6]:
U:=[seq([seq(1/m1,i=1..n)],j=1..m1)]:
Ubar:=U:
P:=[[[0$m1]$m1]$n]:
while d>tol do
#for z from 1 to outerloop do
F:=[]:
for i from 1 to m1 do
F:=[op(F),Fidelity(DATA[i],C,FParam[1],FParam[2],FParam[3])]:
od:
F:=tau*F+[[1$n]$m1]:
innercount:=0:
while innercount<innerloop do
Uold:=U:
#dual step (restored, assumed to be the standard PDHG ascent: add sigma times the nonlocal
#gradient of the kth column of Ubar to P[k], then rescale P[k] so its entries lie in [-1,1])
temp:=ListTools[Transpose](Ubar):
for k from 1 to n do
P[k]:=P[k]+sigma*~NonlocalGradient(W,temp[k]):
hope:=1:
for i from 1 to m1 do
rec:=max(op(map(abs,P[k][i]))):
if rec>hope then hope:=rec: fi:
od:
P[k]:=P[k]/hope:
od:
#primal step: descend in U using the nonlocal divergence, then reweight by the fidelity term
temp:=ListTools[Transpose](U):
tempF:=ListTools[Transpose](F):
for k from 1 to n do
temp[k]:=temp[k]+tau*~NonlocalDivergence(W,P[k]):
temp[k]:=temp[k]*~(tempF[k]^~(-1/2)):
od:
U:=ListTools[Transpose](temp):
#re-project each row of U onto the simplex
for i from 1 to m1 do
tempF:=F[i]^~(1/2):
U[i]:=PDInner(U[i],tempF):
od:
#over-relaxation step
Ubar:=U+theta*~(U-Uold):
innercount:=innercount+1:
od:
#Uhard:=Threshhold(U,ThreshParam[1],ThreshParam[2],ThreshParam[3]):
Uhard:=DumbThreshhold(U):
SDATA:=SortData(DATA,Uhard):
COld:=C:
C:=NewCentroids(SDATA):
#restored guard (assumed): if the centroid update came back degenerate, reinitialize
while nops(C)<n do
C:=RandomInitialCentroids(DATA,n):
od:
#restored stopping measure (assumed): the largest change between successive centroid sets
d:=max(seq(seq(abs(C[i][j]-COld[i][j]),j=1..m2),i=1..n)):
od:
if full=1 then
SDATA:
else
Uhard:
fi:
end:
#PDInner(u,f):
#Inputs:
#u: a vector given as a list of n entries
#f: a list of n positive weights
#Outputs:
#the projection of u, in the metric weighted by f, onto the probability simplex: a list with nonnegative entries summing to 1
PDInner:=proc(u,f) local n,a,sa,cond,tmpsum,tmpsumdom,ind,tmax,i:
n:=nops(u):
a:=sort(u*~f,`>`,output=[sorted,permutation]):
sa:=PermuteList(f,a[2]):
cond:=a[1]*~(sa^~(-2)):
tmpsum:=0: tmpsumdom:=0: ind:=true:
for i from 1 to n-1 do
tmpsum:=tmpsum+cond[i]:
tmpsumdom:=tmpsumdom+sa[i]^(-2):
tmax:=(tmpsum-1)/tmpsumdom:
if tmax>=a[1][i+1] then ind:=false: break: fi:
od:
if ind then tmax:=(tmpsum+cond[n]-1)/(tmpsumdom+sa[n]^(-2)): fi:
[seq(max(u[i]-tmax/f[i],0)/f[i],i=1..n)]:
end:
#PermuteList(L,P)
#Inputs:
#L: a list
#P: a permutation of the integers from 1 to nops(L)
#Outputs:
#A new list Lnew consisting of the elements of L ordered under the permutation P
PermuteList:=proc(L,P) local i,n,Lnew:
Lnew:=[]:
n:=nops(L):
for i from 1 to n do
Lnew:=[op(Lnew),L[P[i]]]:
od:
Lnew:
end:
#AverageVector(M):
#Inputs:
#M: a matrix represented as a list of lists
#Outputs:
#v: the mean vector of the set of row vectors of M (calculated component-wise), as a list.
AverageVector:=proc(M) local v,m,n,i,j:
m:=nops(M):
n:=nops(M[1]):
v:=[]:
for i from 1 to n do
v:=[op(v),add(M[j][i],j=1..m)/m]:
od:
v:
end:
#NewCentroids(SDATA):
#Inputs:
#SDATA: a list of lists of lists, where each sublist is the list of vectors (represented as lists) assigned together to a cluster.
#Outputs:
#C: a matrix represented as a list of n lists, with each row consisting of the vector that is the average of the vectors in SDATA[i] for 1<=i<=n.
NewCentroids:=proc(SDATA) local C,n,i:
n:=nops(SDATA):
C:=[]:
for i from 1 to n do
C:=[op(C),AverageVector(SDATA[i])]:
od:
C:
end:
#DumbThreshhold(U):
#Inputs:
#U: a matrix given by a list of lists representing the assignment function of the m1 datapoints to n clusters
#Outputs:
#Uhard: a matrix given by a list of lists which sets the maximal element of each row of U to 1, indicating the assigned cluster, and every other element to 0.
DumbThreshhold:=proc(U) local m1,n,i,Uhard,k:
m1:=nops(U):
n:=nops(U[1]):
Uhard:=[[0$n]$m1]:
for i from 1 to m1 do
k:=ListTools[FindMaximalElement](U[i],position)[2]:
Uhard[i][k]:=1:
od:
Uhard:
end:
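#The projection step can be sanity-checked in isolation. A minimal demo (not one of the original
#auxiliary functions), assuming unit weights, under which PDInner reduces to the standard
#Euclidean projection onto the probability simplex:
DemoPDInner:=proc() local x,i:
x:=PDInner([.3,.2,.9],[1,1,1]):
print(x):
print(add(x[i],i=1..nops(x))):
end:
#Calling DemoPDInner() should print a nonnegative triple whose entries add up to 1.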
#Threshhold(U,s,eta,y): performs smart simplex clustering.
#Inputs:
#U: a matrix given by a list of lists representing the assignment function of the m1 datapoints to n clusters
#s: a positive integer. this parameter indicates the size of the grid that will be created on the simplex to check; larger s will result in more accurate results but longer runtimes.
#eta: a positive number. this parameter indicates the relative weight of stability vs sparsity.
#y: a positive number. this parameter controls the size of the `y-shaped region`, ie how far the datapoints should stay away from the edges of the clusters.
#Outputs:
#Uhard: a matrix given by a list of lists which sets one element of each row of U to 1, indicating the cluster assigned by smart simplex clustering, and every other element to 0.
Threshhold:=proc(U,s,eta,y)
local m1,n,delta,gs,d,fdel,gdel,A,i,mid,temp,cluster,j,c,champ,rec,hope,champdelta,hard,Uhard,c1:
m1:=nops(U):
n:=nops(U[1]):
gs:=Matrix(n*m1,n,1/(s*y)):
delta:=SimplexGrid(n,s):
d:=nops(delta):
fdel:=Matrix(d,n,0):
gdel:=Matrix(d,1,0):
A:=ProjectEdge(U):
for i from 1 to d do
mid:=ProjectEdge(delta[i]):
mid:=convert(linalg[blockmatrix](m1,1,[convert(mid,`matrix`)$m1]),`Matrix`):
cluster:=Matrix(n,1,0):
temp:=A-mid:
temp:=GreaterThanEq(temp,0):
temp:=MTM[sum](temp,2):
temp:=GreaterThanEq(temp,n):
temp:=ArrayTools[Reshape](temp,[n,m1]):
temp:=NoDoubleCounting(temp):
cluster:=FindFlatIndex(temp):
cluster:=LinearAlgebra[Modular][Mod](n,cluster,integer):
c1:=0:
for j from 1 to n-1 do
c:=CountInstances(cluster,j):
c1:=c1+c:
fdel[i,j]:=c/m1:
od:
c:=CountInstances(cluster,0):
c1:=c1+c:
if c1=m1 then
fdel[i,n]:=c/m1:
else
fdel[i]:=convert([1$(n-1),-1],`Vector`):
fi:
temp:=A-mid-gs:
temp:=GreaterThanEq(temp,0):
temp:=MTM[sum](temp,2):
temp:=GreaterThanEq(temp,n-1):
gdel[i,1]:=1-MTM[sum](temp)/m1:
od:
fdel:=MTM[prod](fdel,2):
fdel:=SetAToB(fdel,0,10^(-10)):
fdel:=convert(fdel,`list`):
gdel:=convert(gdel,`list`):
#score each grid point: -log(fdel) rewards balanced clusters, eta*exp(gdel) penalizes instability
if fdel[1]<0 then hope:=infinity: else hope:=evalf(-log(fdel[1])+eta*exp(gdel[1])): fi:
champ:=1:
for i from 2 to d do
if fdel[i]<0 then rec:=infinity: else rec:=evalf(-log(fdel[i])+eta*exp(gdel[i])): fi:
if rec<hope then hope:=rec: champ:=i: fi:
od:
#restored tail (assumed): hard-assign each datapoint from the winning grid point champdelta,
#mirroring the region test used for fdel above
champdelta:=delta[champ]:
mid:=ProjectEdge(champdelta):
mid:=convert(linalg[blockmatrix](m1,1,[convert(mid,`matrix`)$m1]),`Matrix`):
temp:=A-mid:
temp:=GreaterThanEq(temp,0):
temp:=MTM[sum](temp,2):
temp:=GreaterThanEq(temp,n):
temp:=ArrayTools[Reshape](temp,[n,m1]):
temp:=NoDoubleCounting(temp):
Uhard:=[]:
for i from 1 to m1 do
hard:=n:
for j from 1 to n do
if temp[j,i]<>0 then hard:=j: break: fi:
od:
Uhard:=[op(Uhard),ConvertHard(hard,n)]:
od:
Uhard:
end:
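#A quick check of the simple thresholding that Threshhold refines. A minimal demo (not one of
#the original auxiliary functions):
DemoDumbThreshhold:=proc():
DumbThreshhold([[.2,.5,.3],[.7,.1,.2],[.1,.2,.7]]):
end:
#DemoDumbThreshhold() should return [[0,1,0],[1,0,0],[0,0,1]], each row marking its largest entry.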
#GreaterThanEq(temp,a):
#Inputs:
#temp: a matrix or vector
#a: a numeric value
#Outputs:
#a matrix or vector of the same dimensions in which all entries of temp that are >=a are set to 1, and all other entries to 0.
GreaterThanEq:=proc(temp,a) local Ind,t1,t2,i,j:
if type(temp,`Vector`) then
t1:=LinearAlgebra[Dimension](temp):
Ind:=Vector(t1,0):
for i from 1 to t1 do
if temp[i]>=a then Ind[i]:=1: fi:
od:
return(Ind):
else
t1,t2:=LinearAlgebra[Dimension](temp):
fi:
Ind:=Matrix(t1,t2,0):
for i from 1 to t1 do
for j from 1 to t2 do
if temp[i,j]>=a then Ind[i,j]:=1: fi:
od:
od:
Ind:
end:
#NoDoubleCounting(temp):
#Inputs:
#temp: a matrix
#Outputs:
#a matrix identical to temp, except that in each column every value below the first non-zero value is set to 0
NoDoubleCounting:=proc(temp) local T,Tnew,t1,t2,i,j:
T:=LinearAlgebra[Transpose](temp):
t1,t2:=LinearAlgebra[Dimension](T):
Tnew:=Matrix(t1,t2,0):
for i from 1 to t1 do
for j from 1 to t2 do
if T[i,j]<>0 then
Tnew[i]:=convert([0$(j-1),T[i,j],0$(t2-j)],`Vector`):
break:
fi:
if j=t2 then Tnew[i]:=convert([0$t2],`Vector`): fi:
od:
od:
LinearAlgebra[Transpose](Tnew):
end:
#SetAToB(M,a,b):
#Inputs:
#M: a matrix
#a, b: numeric values
#Outputs:
#A matrix identical to M except all instances of a have been replaced with b
SetAToB:=proc(M,a,b) local Mnew,m1,m2,i,j:
if type(M,`Vector`) then
m1:=LinearAlgebra[Dimension](M):
Mnew:=copy(M): #copy so that the input is not modified in place
for i from 1 to m1 do
if M[i]=a then Mnew[i]:=b: fi:
od:
return(Mnew):
else
m1,m2:=LinearAlgebra[Dimension](M):
fi:
Mnew:=copy(M):
for i from 1 to m1 do
for j from 1 to m2 do
if M[i,j]=a then Mnew[i,j]:=b: fi:
od:
od:
Mnew:
end:
#FindFlatIndex(temp)
#Inputs:
#temp: a matrix
#Outputs:
#the flat indices of all nonzero elements, where the entry temp[i,j] is given the flat index i*j
FindFlatIndex:=proc(temp) local Ind,t1,i,t2,j:
t1,t2:=LinearAlgebra[Dimension](temp):
Ind:=[]:
for i from 1 to t1 do
for j from 1 to t2 do
if temp[i,j]<>0 then Ind:=[op(Ind),i*j]: fi:
od:
od:
convert(Ind,`Matrix`):
end:
#CountInstances(temp,k)
#Inputs:
#temp: a matrix
#k: a numeric value
#Outputs:
#c: the number of times `k` appears in temp
CountInstances:=proc(temp,k) local T,c,t1,i:
T:=convert(temp,`list`):
t1:=nops(T):
c:=0:
for i from 1 to t1 do
if T[i]=k then c:=c+1: fi:
od:
c:
end:
#ConvertHard(i,n)
#Inputs:
#i,n: integers such that n>=i>0.
#Outputs:
#a vector given as a list with n entries and the ith entry 1, all others 0.
ConvertHard:=proc(i,n):
[0$(i-1),1,0$(n-i)]:
end:
#SortData(DATA,Uhard):
#Inputs:
#DATA: an m1xm2 matrix represented as a list of lists, with each row consisting of the vectors for each datapoint to be compared.
#Uhard: an m1xn matrix given by a list of lists, where n is the number of clusters for the data to be sorted into. Entries of Uhard are all 0's and 1's, with a single 1 in each row, such that Uhard_ij=1 means that the vector DATA[i] has been sorted into cluster j.
#Note that as we are working in list form, we can input U instead of Uhard and get the same results. The construction of Uhard is useful in code optimization because it allows for usage of sparse matrices.
#Outputs:
#SDATA: a list of lists of lists, where each sublist is the list of vectors (represented as lists) assigned together to a cluster as indicated by Uhard.
SortData:=proc(DATA,Uhard) local m1,n,i,c,SDATA:
m1:=nops(DATA):
n:=nops(Uhard[1]):
SDATA:=[[]$n]:
for i from 1 to m1 do
c:=ListTools[FindMaximalElement](Uhard[i],position)[2]:
SDATA[c]:=[op(SDATA[c]),DATA[i]]:
od:
SDATA:
end:
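#A minimal usage sketch (not one of the original auxiliary functions) of the hard-assignment
#pipeline: threshold a soft assignment, then sort toy datapoints by the resulting clusters.
DemoSortData:=proc() local Uhard:
Uhard:=DumbThreshhold([[.9,.1],[.2,.8],[.6,.4]]):
SortData([[1,2],[3,4],[5,6]],Uhard):
end:
#DemoSortData() should return [[[1,2],[5,6]],[[3,4]]]: the first and third points land in cluster 1, the second in cluster 2.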
#NormalizeData(DATA)
#Inputs:
#DATA: a matrix represented as a list of lists, with each row consisting of the vectors for each datapoint to be compared.
#Outputs:
#a new matrix represented as a list of lists giving a `normalized` version of DATA, where each column has been divided by the maximum absolute value of the column.
NormalizeData:=proc(DATA) local M,m,n,i:
m:=nops(DATA):
n:=nops(DATA[1]):
M:=LinearAlgebra[Transpose](convert(DATA,`Matrix`)):
for i from 1 to n do
M[i]:=LinearAlgebra[Normalize](M[i]):
od:
M:=LinearAlgebra[Transpose](M):
[seq(convert(M[i],`list`),i=1..m)]:
end:
#################################################################################
#################################################################################
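#NLTVPD above calls NonlocalGradient(W,U) and NonlocalDivergence(W,v). The sketches below (not
#the originals; an assumption about the exact form used) illustrate the standard discrete
#nonlocal operators, chosen so that the divergence is the negative adjoint of the gradient:
#(grad_W u)[i][j]=sqrt(W[i][j])*(u[j]-u[i]) and (div_W P)[i]=add of sqrt(W[i][j])*(P[i][j]-P[j][i]) over j.
NonlocalGradientSketch:=proc(W,u) local m1,i,j:
m1:=nops(u):
[seq([seq(sqrt(W[i][j])*(u[j]-u[i]),j=1..m1)],i=1..m1)]:
end:
NonlocalDivergenceSketch:=proc(W,P) local m1,i,j:
m1:=nops(P):
[seq(add(sqrt(W[i][j])*(P[i][j]-P[j][i]),j=1..m1),i=1..m1)]:
end: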