PLEASE READ AND UNDERSTAND THE ATTACHED PDF FILE PRIOR TO SUBMITTING A BID
This project can be completed in either Java or Perl.
The Java/Perl code must include extensive comments, so that I can learn from it.
An attempt was made to begin this project in Java. About 125 lines of code are included at the end of the attached PDF file. This Java code may be helpful and worth building upon, or it may be of no value, in which case it would be better to begin coding from scratch.
In the attached PDF document, I manually show each step, using the first twelve records in the data set. You can follow along and see what is happening at each step.
The input data file is a vertical-bar-delimited text file containing 5,298 records, each with 67 fields.
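To illustrate the input format, here is a minimal Java sketch of how one record might be split into its fields. The class name, method names, and the sample line are hypothetical and are not taken from the attached PDF; only the delimiter and the field count of 67 come from the description above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; all names here are illustrative only.
class PipeRecordSketch {

    // Field count per record, as stated in the project description.
    static final int EXPECTED_FIELDS = 67;

    // Split one line on the vertical bar. The bar is a regex metacharacter,
    // so it must be escaped. The -1 limit keeps trailing empty fields, which
    // String.split would otherwise silently drop.
    static List<String> parseLine(String line) {
        return Arrays.asList(line.split("\\|", -1));
    }

    // True when the record has exactly the expected number of fields.
    static boolean hasExpectedWidth(List<String> fields) {
        return fields.size() == EXPECTED_FIELDS;
    }

    public static void main(String[] args) {
        // Made-up sample line with six fields, the last two empty.
        List<String> fields = parseLine("000001|A17|x|y||");
        System.out.println(fields.size());            // 6
        System.out.println(hasExpectedWidth(fields)); // false
    }
}
```

Checking the field count on every line as it is read would catch malformed records before any of the later steps run.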
Step 1: Data will need to be sorted (ascending) based upon the values in the first two fields
Step 2: Delete duplicate values contained within fields 5 to 67
Step 3: Group by “categories” and sum the “cosine” values
Step 4: If the “categories” fields for “rank” = 000001 contain values, and any of those “categories” values matches a “category” value in “rank” = 999999, then populate the “rank” = 000001 “cosine” with the value of the “cosine” in “rank” = 999999
Step 5: For each of the 1,766 records, sort the categories in descending order, based upon the value of “cosine”
Step 6: Calculate the number of categories that are above, equal to, or below, a USER_DEFINED_THRESHHOLD
Step 7: Calculate some statistics if and only if “rank” = 000001 and greater_than_user_defined_threshhold PLUS equal_to_user_defined_threshhold PLUS less_than_user_defined_threshhold IS GREATER THAN ZERO
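Some of the steps above can be sketched in Java as follows. This is only a rough illustration of Steps 1, 6 and 7, not the required implementation: the class and method names are hypothetical, the sample values are made up, and it assumes that the first two fields compare as plain text and that the “cosine” values have already been parsed into doubles. The reading of Step 7 as two conditions joined by "and" is also an assumption. The attached PDF remains the authority on the details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch class; all names here are illustrative only.
class StepSketch {

    // Step 1: sort ascending by field 1, then field 2 (array indexes 0 and 1).
    // Assumption: both fields are compared as plain text.
    static void sortByFirstTwoFields(List<String[]> records) {
        records.sort(Comparator.comparing((String[] r) -> r[0])
                               .thenComparing(r -> r[1]));
    }

    // Step 6: count the cosine values above, equal to, and below the threshold.
    // Returns the counts in the order {above, equal, below}.
    // Note: exact equality on doubles is an assumption; BigDecimal may be
    // a safer choice if the values carry many decimal places.
    static int[] countAgainstThreshold(double[] cosines, double threshold) {
        int above = 0, equal = 0, below = 0;
        for (double c : cosines) {
            if (c > threshold) above++;
            else if (c == threshold) equal++;
            else below++;
        }
        return new int[] {above, equal, below};
    }

    // Step 7 guard: statistics are calculated only when rank is 000001
    // and the three counts sum to more than zero.
    static boolean shouldComputeStats(String rank, int[] counts) {
        return "000001".equals(rank) && (counts[0] + counts[1] + counts[2]) > 0;
    }

    public static void main(String[] args) {
        // Made-up records holding only the first two fields.
        List<String[]> records = new ArrayList<>(Arrays.asList(
                new String[] {"000002", "B"},
                new String[] {"000001", "Z"},
                new String[] {"000001", "A"}));
        sortByFirstTwoFields(records);
        for (String[] r : records) {
            System.out.println(String.join("|", r));
        }

        // Made-up cosine values: 1 above, 2 equal, 1 below a threshold of 0.5.
        int[] counts = countAgainstThreshold(new double[] {0.9, 0.5, 0.5, 0.1}, 0.5);
        System.out.println(Arrays.toString(counts));
        System.out.println(shouldComputeStats("000001", counts));
    }
}
```

Keeping each step in its own small method makes it easier to check the output against the manual walk-through of the first twelve records in the PDF.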
23 freelancers are bidding an average of $60 for this job
I am a Senior Java/Scala Developer with 15 years of experience in design and development and strong problem-solving skills.
Java and Perl are my top two languages. I can deliver in either with well-commented code. Your project is straightforward, and I'm trying to get back into Freelancer, so I'm going for a low bid.
19 years' experience mostly working in Java. Many completed data management utilities like this. Native English speaker, so we can be sure not to have any trouble with communication.