Plagiarism Detection as a Problem of Machine Learning
![Page 1: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/1.jpg)
Plagiarism Detection as a Problem of Machine Learning
Academician Yuri I. Zhuravlev
Corresponding Member of RAS Konstantin V. Rudakov
Gleb V. Nikitov
Computing Center of Russian Academy of Sciences
Forecsys Corporation
![Page 2: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/2.jpg)
About the problem
Detect borrowed text in students' papers
Do it quickly and conveniently
Do it accurately and with supporting evidence
![Page 3: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/3.jpg)
Existing solutions
Turnitin
Mydropbox
www.antiplagiat.ru
![Page 4: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/4.jpg)
Working scheme
Paper
Instructor
Antiplagiat
Collection of documents
![Page 5: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/5.jpg)
Search domain
Internet
![Page 6: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/6.jpg)
Already using
Higher School of Economics
Moscow Institute of Economics, Management and Law
Moscow Pedagogical State University
Moscow Municipal Psychological and Pedagogical Institute
Nizhni Novgorod State University
Academy of Budget and Treasury of the Russian Ministry of Finance
![Page 7: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/7.jpg)
Non-educational use
Higher Certifying Commission
Russian State Library (formerly named after Lenin)
![Page 8: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/8.jpg)
Negotiations
Moscow State University
Moscow Physical and Technical Institute
Russian Academy of Justice
International Academy of Enterprise
![Page 9: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/9.jpg)
Quality and Performance
Leading operating speed with no loss in the quality of the results
70 thousand registered users
About 20 thousand originality reports generated every day
Continual improvement of the search algorithms and expanding functionality
![Page 10: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/10.jpg)
Plagiarism, what is it?
![Page 11: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/11.jpg)
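Formulation of problem
Permissible objects: $\{S_i \mid i \in \{1,\dots,N\}\} = \{(Fr_i^1, Fr_i^2) \mid i = \overline{1,N}\}$
Descriptive functions: $\mathrm{Descr} = \{D\}$, where each $D$ is a function defined on the permissible objects
Fixed set of functions: $n$ functions from $\mathrm{Descr}$, whose values on an object $S$ are denoted $D^0(S)$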
![Page 12: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/12.jpg)
Formulation of problem
The problem: $A : I_i \to I_f$
Initial information: $I_i = D^0(S)$
Final information: $I_f = \{0, 1, \dots, k-1\}$
![Page 13: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/13.jpg)
Formulation of problem
Precedent information: $(S^1, An_1), \dots, (S^q, An_q)$, where $S^j$ is a permissible object and $An_j \in I_f$, $j = \overline{1,q}$
Precedent conditions: $\forall j \in \{1,\dots,q\}:\ A(D^0(S^j)) = An_j$
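The precedent conditions can be illustrated with a minimal Python sketch. It assumes the conditions require the algorithm to reproduce the answer of every precedent. The description `shared_words`, the algorithm `threshold`, and the sample precedents are hypothetical and are not taken from the slides.

```python
from typing import Callable, Hashable, Iterable, Tuple

Pair = Tuple[str, str]  # a permissible object: a pair of text fragments


def is_correct(
    algorithm: Callable[[Hashable], int],
    describe: Callable[[Pair], Hashable],
    precedents: Iterable[Tuple[Pair, int]],
) -> bool:
    """Return True if algorithm(describe(S_j)) == An_j for every precedent."""
    return all(algorithm(describe(s)) == answer for s, answer in precedents)


def shared_words(pair: Pair) -> int:
    """Hypothetical description D0: number of distinct words both fragments share."""
    first, second = pair
    return len(set(first.split()) & set(second.split()))


def threshold(description: int) -> int:
    """Hypothetical algorithm A with k = 2 answers: 1 if two or more words are shared."""
    return 1 if description >= 2 else 0


# Hypothetical precedents (S_j, An_j).
precedents = [
    (("the cat sat", "the cat ran"), 1),
    (("red apple", "blue sky"), 0),
]
print(is_correct(threshold, shared_words, precedents))
```

An algorithm that fails on even one precedent is not correct in this sense, whatever it does on the remaining objects.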
![Page 14: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/14.jpg)
Formulation of problem
Transitive and reflexive relation $\preceq$:
Example:
0 0 0 0 0 01 2 1 2 1 2 1 2
1,...,, , ,i i
NFr Fr and Fr Fr i Fr Fr Fr Fr
$$\rho(Fr^1, Fr^2) = \frac{WL(Fr^1, Fr^2)}{\min\left(L(Fr^1), L(Fr^2)\right)}$$
$$(Fr_1^1, Fr_1^2) \preceq (Fr_2^1, Fr_2^2) \iff \rho(Fr_1^1, Fr_1^2) \le \rho(Fr_2^1, Fr_2^2)$$
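A small Python sketch of one way to read this example. It assumes that $L$ is the length of a fragment in words, that $WL$ is the number of words the two fragments have in common, and that pairs are compared by the ratio of $WL$ to the shorter length. The slides define none of these, so the function names and the word-level tokenization are hypothetical.

```python
from collections import Counter
from typing import Tuple

Pair = Tuple[str, str]  # a pair of text fragments


def length(fragment: str) -> int:
    """Assumed meaning of L: fragment length in words."""
    return len(fragment.split())


def common_length(first: str, second: str) -> int:
    """Assumed meaning of WL: number of words shared, counted with multiplicity."""
    overlap = Counter(first.split()) & Counter(second.split())
    return sum(overlap.values())


def similarity(pair: Pair) -> float:
    """WL divided by the length of the shorter fragment (0.0 if a fragment is empty)."""
    first, second = pair
    shortest = min(length(first), length(second))
    return common_length(first, second) / shortest if shortest else 0.0


def precedes(left: Pair, right: Pair) -> bool:
    """Relation on pairs induced by the similarity value."""
    return similarity(left) <= similarity(right)
```

Because pairs are compared through a single number, `precedes` is reflexive and transitive by construction, which is the property the slide asks of the relation.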
![Page 15: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/15.jpg)
Formulation of problem
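Additional conditions (with $\preceq$ the transitive and reflexive relation on objects):
$$\forall i_1, i_2 \in \{1,\dots,N\}:\ (Fr_{i_1}^1, Fr_{i_1}^2) \preceq (Fr_{i_2}^1, Fr_{i_2}^2) \Rightarrow A\left(D^0(Fr_{i_1}^1, Fr_{i_1}^2)\right) \le A\left(D^0(Fr_{i_2}^1, Fr_{i_2}^2)\right)$$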
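A minimal sketch of checking such a condition over a finite set of objects. It assumes the additional conditions are a monotonicity requirement: whenever one object precedes another in the relation, the answer for the first must not exceed the answer for the second. The description, the relation, and the sample objects are hypothetical.

```python
from itertools import product
from typing import Callable, Hashable, Sequence, Tuple

Pair = Tuple[str, str]  # a permissible object: a pair of text fragments


def is_monotone(
    algorithm: Callable[[Hashable], int],
    describe: Callable[[Pair], Hashable],
    objects: Sequence[Pair],
    precedes: Callable[[Pair, Pair], bool],
) -> bool:
    """True if precedes(a, b) implies algorithm(describe(a)) <= algorithm(describe(b))."""
    return all(
        algorithm(describe(a)) <= algorithm(describe(b))
        for a, b in product(objects, repeat=2)
        if precedes(a, b)
    )


def shared_words(pair: Pair) -> int:
    """Hypothetical description: number of distinct words both fragments share."""
    first, second = pair
    return len(set(first.split()) & set(second.split()))


def precedes(left: Pair, right: Pair) -> bool:
    """Hypothetical relation: the left pair shares no more words than the right pair."""
    return shared_words(left) <= shared_words(right)


# Hypothetical objects sharing 3, 2 and 0 words respectively.
objects = [("a b c", "a b c"), ("a b c", "a b x"), ("a b c", "x y z")]
```

With these objects, an algorithm that thresholds the number of shared words passes the check, while one that answers 1 only for exactly two shared words fails it.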
![Page 16: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/16.jpg)
Criteria
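Solvability criterion: for a correct algorithm $A$ to exist, it is necessary and sufficient that the following conditions are met:
$$\forall j_1, j_2 \in \{1,\dots,q\},\ An_{j_1} \ne An_{j_2},\ \forall i_1, i_2 \in \{1,\dots,N\}:\ S_{i_1} = S^{j_1} \ \&\ S_{i_2} = S^{j_2} \Rightarrow D^0(S_{i_1}) \ne D^0(S_{i_2})$$
$$\forall j_1, j_2 \in \{1,\dots,q\}:\ S^{j_1} = S^{j_2} \Rightarrow An_{j_1} = An_{j_2}$$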
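A minimal sketch of a solvability check. It assumes each precedent is a single object and that a correct algorithm exists exactly when no two precedents with equal descriptions carry different answers. The description function and the precedents are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

Pair = Tuple[str, str]  # a permissible object: a pair of text fragments


def is_solvable(
    precedents: Iterable[Tuple[Pair, int]],
    describe: Callable[[Pair], Hashable],
) -> bool:
    """True if precedents with equal descriptions always have equal answers."""
    answer_by_description: Dict[Hashable, int] = {}
    for obj, answer in precedents:
        # setdefault stores the first answer seen for this description
        # and returns it on every later occurrence.
        if answer_by_description.setdefault(describe(obj), answer) != answer:
            return False
    return True


def shared_words(pair: Pair) -> int:
    """Hypothetical description: number of distinct words both fragments share."""
    first, second = pair
    return len(set(first.split()) & set(second.split()))


# Hypothetical precedents: both objects share two words but carry different answers.
conflicting = [(("a b", "a b"), 1), (("c d", "c d"), 0)]
consistent = [(("a b", "a b"), 1), (("c d", "x y"), 0)]
```

Under this description no algorithm can answer 1 for the first object of `conflicting` and 0 for the second, because it receives the same input in both cases.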
![Page 17: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/17.jpg)
Criteria
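Regularity. Definition (according to Zhuravlev): the problem $Z$ is regular if all the problems with arbitrary final information are simultaneously solvable.
Regularity criterion: for a problem to be regular, it is necessary and sufficient that the following conditions are met:
$$\forall j_1, j_2 \in \{1,\dots,q\},\ j_1 \ne j_2:\ S^{j_1} \ne S^{j_2}$$
$$\forall j_1, j_2 \in \{1,\dots,q\},\ j_1 \ne j_2,\ \forall i_1, i_2 \in \{1,\dots,N\}:\ S_{i_1} = S^{j_1} \ \&\ S_{i_2} = S^{j_2} \Rightarrow D^0(S_{i_1}) \ne D^0(S_{i_2})$$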
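A minimal sketch of a regularity check. It assumes that a problem is solvable for every choice of answers exactly when the precedent objects have pairwise distinct descriptions. The description function and the sample precedents are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

Pair = Tuple[str, str]  # a permissible object: a pair of text fragments


def is_regular(
    precedents: Sequence[Tuple[Pair, int]],
    describe: Callable[[Pair], Hashable],
) -> bool:
    """True if all precedent objects have pairwise distinct descriptions."""
    descriptions = [describe(obj) for obj, _answer in precedents]
    return len(set(descriptions)) == len(descriptions)


def shared_words(pair: Pair) -> int:
    """Hypothetical description: number of distinct words both fragments share."""
    first, second = pair
    return len(set(first.split()) & set(second.split()))


# Hypothetical precedents: solvable with these answers, but not regular,
# because relabelling one of them would make the answers conflict.
same_description = [(("a b", "a b"), 1), (("c d", "c d"), 1)]
distinct_descriptions = [(("a b", "a b"), 1), (("c d", "x y"), 1)]
```

The answers play no role in the check, which reflects the definition: regularity is a property of the objects and their descriptions alone.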
![Page 18: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/18.jpg)
Criteria
Monotone solvability criterion: for the problem to be monotonically solvable, it is necessary and sufficient that the conditions of the solvability criterion are met together with the following conditions:
1 2 1 1 2 2
1 2
1 2 1 2{1,..., } {1,..., }
0 0
: &j j i j i j
q N
i i
j j An An i i S S S S
D S D S
![Page 19: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/19.jpg)
Criteria
Monotone regularity criterion: for the problem to be monotonically regular, it is necessary and sufficient that the following conditions are met:
2121
},...,1{
jj
qSSjj
212211 0021
},...,1{21
},...,1{||&:,, iijiji
NqSDSDSSSSiijj
![Page 20: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/20.jpg)
Criteria
Supercompleteness: the family of algorithms $M$ is called supercomplete in the described class of problems if for each problem $Z$ from the set of solvable problems there exists in $M$ at least one correct algorithm.
$$Z \in [Z]_S$$
![Page 21: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/21.jpg)
Criteria
Completeness: the family of algorithms $M$ is called complete in the described class of problems if for each problem $Z$ from the set of regular problems there exists in $M$ at least one correct algorithm.
$$Z \in [Z]_R$$
![Page 22: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/22.jpg)
Criteria
Supercompleteness criterion: for the family of algorithmic operators $M^0$ to be supercomplete, it is necessary and sufficient that the following conditions are met:
1 2 1 20 0
1 2{1,..., }
, : :i i i i
Ni i S S B B D S B D S
0M
![Page 23: Plagiarism Detection as a Problem of Machine Learning](https://reader033.fdocuments.in/reader033/viewer/2022061605/56814cfc550346895dba1aa4/html5/thumbnails/23.jpg)
Criteria
Completeness criterion: for the family of algorithmic operators $M^0$ to be complete, it is necessary and sufficient that the following conditions are met:
1 2 2 1 1 20 0
1 2{1,..., }, : & :i i i i i i
Ni i S S S S B B D S B D S
0M