Plagiarism Detection for Code Snippet Questions

  1. How do we at Mercer | Mettl calculate plagiarism for our coding questions? 
     
    At Mercer | Mettl, we use MOSS (Measure of Software Similarity) to detect plagiarism in our coding assessments. MOSS tokenizes a candidate’s code and compares it with every other candidate’s code to identify areas of potential overlap. Candidates often introduce extra whitespace, rename variables or reformat their code to evade plagiarism detection, but MOSS overcomes this easily: because the overall structure of the program remains unchanged, the matching token sequences between the codes are preserved. 
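    To illustrate the idea (a minimal sketch only, not Mettl's or MOSS's actual implementation), the following Python snippet shows how a tokenizer-based comparison collapses identifier names and ignores whitespace, so that a renamed and reformatted copy produces an identical token stream:

    ```python
    # Sketch: identifier-insensitive tokenization, in the spirit of MOSS.
    # Every identifier collapses to the placeholder "ID"; whitespace,
    # indentation and comments are dropped, so renaming variables or
    # reformatting the code does not change the token stream.
    import io
    import keyword
    import token
    import tokenize

    def normalized_tokens(source: str) -> list[str]:
        """Tokenize Python source, mapping every non-keyword name to 'ID'."""
        toks = []
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == token.NAME and not keyword.iskeyword(tok.string):
                toks.append("ID")  # variable/function names collapse
            elif tok.type in (token.NEWLINE, token.NL, token.INDENT,
                              token.DEDENT, token.COMMENT, token.ENDMARKER):
                continue           # layout and comments are ignored
            else:
                toks.append(tok.string)
        return toks

    a = "total = 0\nfor x in nums:\n    total += x\n"
    b = "s=0\nfor item in values:\n    s+=item\n"  # renamed, reformatted copy

    print(normalized_tokens(a) == normalized_tokens(b))  # True
    ```

    Real fingerprinting schemes go further (e.g. hashing k-token windows and keeping a subset of the hashes), but the normalization step above is what makes whitespace tricks and variable renaming ineffective.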
     
    While flagging for plagiarism, the submission of a candidate is compared across all the previous submissions by other candidates in a particular assessment. 
     
    The plagiarism percentage reflected in a candidate’s report is an indicator of similarity, not proof that the candidate deliberately copied code without attribution; MOSS automatically detects program similarity, but it has no way of knowing why two pieces of code are similar.  
    We therefore always recommend looking at the parts of the code that MOSS highlights and deciding whether plagiarism actually exists. 
     
    We at Mercer | Mettl believe that it is incorrect to rely completely on the calculated plagiarism percentage. These scores are useful for judging the relative amount of matching between different pairs of programs, and for spotting pairs that stick out with unusual amounts of matching. However, they are not proof of plagiarism, only an indication of similarity; to verify it, someone must still look at the code. Additionally, the plagiarism labels are not meant to select or reject a candidate on their own, and other parameters such as test-case results, proctoring and browsing tolerance should also be considered. 
     
  2. For which candidates is the plagiarism percentage not calculated in a coding assessment? 
     
    We do not calculate plagiarism for candidates who have scored zero marks on a coding question in an assessment. Additionally, we calculate plagiarism only for candidates who have written a substantial amount of code. 
     

  3. Which languages do we support the calculation of the plagiarism percentage for? 
     
    We currently support plagiarism calculation for C, C18, CPP, CPP17, Java6, Java7, Java8, Java11, Java17, PHP, PHP8, Python2, Python3, JavaScript8, JavaScript19 and Node. For all other languages, the plagiarism score is reflected as N/A. 


  4. How does Mercer | Mettl tag the plagiarism percentage as Acceptable or Not Acceptable? 
     
    By default, the not-acceptable plagiarism percentage is fixed at 60%, i.e., candidates whose plagiarism percentage is >= 60% on any of the coding questions in an assessment are tagged as Not Acceptable. 
     
    However, the not-acceptable plagiarism percentage for an assessment can be customized if required; please refer to this [link] for details.  
     

  5. Can we configure the not-acceptable plagiarism percentage? If yes, then how? 
     
    Yes, the not-acceptable plagiarism percentage can be configured at the customer's end for each assessment. 
    Please refer to this [link] for details.  
     
    By default, the not-acceptable plagiarism percentage is fixed at 60%.  
     

  6. What is meant by the various labels provided in the report with respect to the plagiarism percentage? 
     
    At present, a candidate’s plagiarism percentage can be labelled into one of the following: 
     

  1. Not Acceptable: When a candidate’s plagiarism percentage is greater than or equal to the not-acceptable plagiarism percentage.  
    By default, the not-acceptable score is 60% but can be customised according to the customer's requirement. 
     

  2. Acceptable: When a candidate’s plagiarism percentage is less than the not-acceptable plagiarism percentage.  
    By default, the not-acceptable score is 60% but can be customised according to the customer's requirement. 
     

  3. No Matches: When a candidate’s plagiarism percentage is 0%, as no potential matches have been found for that candidate’s code.  
     

  4. NA: When plagiarism for a particular candidate has not been calculated because of one or more of the following reasons:  

    1. The candidate has scored zero marks.

    2. The candidate has not written a substantial amount of code for plagiarism to be calculated upon.

    3. Our algorithm does not support the language the candidate has coded in. 

    4. Plagiarism was not enabled when the report was generated.
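Taken together, the labelling rules above can be sketched as a single decision function. This is an illustrative sketch only: the function and parameter names are hypothetical, the supported-language set is abbreviated, and the actual product logic may differ.

```python
# Hypothetical sketch of the report labels described above.
# Names, parameters and the language set are illustrative, not Mettl's code.
SUPPORTED_LANGUAGES = {"C", "CPP", "Java8", "Python3", "Node"}  # abbreviated

def plagiarism_label(percent: float, score: float, language: str,
                     substantial_code: bool = True,
                     plagiarism_enabled: bool = True,
                     not_acceptable_threshold: float = 60.0) -> str:
    """Map a candidate's plagiarism data to one of the four report labels."""
    # NA: plagiarism was never calculated for this candidate.
    if (score == 0 or not substantial_code
            or language not in SUPPORTED_LANGUAGES or not plagiarism_enabled):
        return "NA"
    # No Matches: no potential matches were found at all.
    if percent == 0:
        return "No Matches"
    # Default threshold is 60%, but it can be customised per assessment.
    if percent >= not_acceptable_threshold:
        return "Not Acceptable"
    return "Acceptable"

print(plagiarism_label(72.5, score=8, language="Python3"))  # Not Acceptable
print(plagiarism_label(40.0, score=8, language="Python3"))  # Acceptable
print(plagiarism_label(55.0, score=0, language="Python3"))  # NA
```

Passing a custom `not_acceptable_threshold` mirrors the per-assessment customization described in questions 4 and 5, and the four early-return branches correspond one-to-one to the NA reasons listed above.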