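YouTube videos for self-studying "Web Data Mining": techniques for analyzing and extracting big data from the web and storing it in a database
These are YouTube videos for studying the mining of web data.
They cover the basics of text mining as well as specific algorithms.
They also present practical applications,
such as how a search system like Google is built.
Video information:
By "Oresoft LWC":
the "Web Data Mining" video series (in English)
1-50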
- 1. WDM 1: What is Data Mining
- 2. WDM 2: Structured Data, Unstructured data and Information Retrieval
- 3. WDM 3: Term-Document Incidence Matrix (1)
- 4. WDM 4: Term-Document Incidence Matrix (2)
- 5. WDM 5: Inverted Index
- 6. WDM 6: Tradeoffs in implementing an Inverted Index
- 7. WDM 7: Processing AND OR NOT queries
- 8. WDM 8: Overview of Index Construction Pipeline
- 9. WDM 9: Query optimization using Document Frequency 1
- 10. WDM 10: Query optimization using Document Frequency 2
- 11. WDM 11: Boolean Retrieval Model
- 12. WDM 12: Example of a Boolean Retrieval Model
- 13. WDM 13: Limitations of a Boolean Retrieval Model
- 14. WDM 14: How an User Interact with IR System
- 15. WDM 15: How to evaluate performance of an IR System
- 16. WDM 16: google zeitgeist
- 17. WDM 17: Parsing Documents and Issues Associated with it
- 18. WDM 18: Tokenization Process in an IR System
- 19. WDM 19: Normalization to Terms
- 20. WDM 20: Faster Postings Merges With Skip Pointers
- 21. WDM 21: How to Handle Phrase Query
- 22. WDM 22: Phrase Query Using Positional Index
- 23. WDM 23: How to handle proximity query
- 24. WDM 24: Discussion on Positional Index Size
- 25. WDM 25: Dictionary Data Structure Implementation
- 26. WDM 26: Wild card queries
- 27. WDM 27: Questions on Wild Card Queries
- 28. WDM 28: Wild Card Query Handling Using Permuterm Index
- 29. WDM 29: Wild Card Query Handling Using K-Gram Index
- 30. WDM 30: Soundex Algorithm
- 31. WDM 31: Spelling Correction Techniques in an IR System
- 32. WDM 33: Questions on Soundex Algorithm
- 33. WDM 34: Spelling Correction (Part 2)
- 34. WDM 35: Intro To Dynamic Programming
- 35. WDM 36: How To Calculate Edit Distance Between Two Strings
- 36. WDM 37: Spelling Correction Using Weighted Edit Distance
- 37. WDM 38: Spelling Correction Using Ngram Overlap Technique
- 38. WDM 39: Calculating Jaccard Coefficient ( An Example)
- 39. WDM 40: Context Sensitive Spell Correction
- 40. WDM 41: Introduction to Index Construction
- 41. WDM 42: Index Construction Using InMemory Sorting
- 42. WDM 43: Index Construction Using Blocked Sort Based Indexing Algorithm
- 43. WDM 44: Index Construction Using Single Pass In Memory Indexing
- 44. WDM 45: Introduction To Distributed Indexing
- 45. WDM 46: How To build distributed indexes
- 46. WDM 47: Q & A on Distributed Index
- 47. WDM 48: Map Reduce
- 48. WDM 49: Dynamic indexing using naive approach
- 49. WDM 50: Dynamic indexing using logarithimic merge
- 50. WDM 51: Issues With Multiple Indexes
51-100
- 51. WDM 52: Why do we compress indexes
- 52. WDM 53: Important Statistics about RCV Collection
- 53. WDM 54: Various Dictionary Compression Techniques
- 54. WDM 55: Various Dictionary Compression Techniques Part2
- 55. WDM 56: Various Posting Compression Techniques
- 56. WDM 57: Ranked Retrieval Model
- 57. WDM 58: Jaccard Score
- 58. WDM 59: Term Frequency Weighing And Bag Of Words Model
- 59. WDM 60: Inverse Document Frequency
- 60. WDM 61: TF-IDF Score
- 61. WDM 62: Documents AS TF-IDF Vectors
- 62. WDM 63: Length Normalization
- 63. 64 Cosine Similarity Example
- 64. WDM 65: Computing Cosine Scores On Index
- 65. WDM 66: Variants of TF IDF Weights
- 66. WDM 67: Term at a Time Scoring
- 67. WDM 68: Efficient Cosine Ranking
- 68. WDM 69: Generic Approach For Speeding up Cosine_Similarity
- 69. WDM 70: Index Elimination
- 70. WDM 71: Champion Lists
- 71. WDM 72: Static Quality Score
- 72. WDM 73: High And Low Lists
- 73. WDM 74: Impact Ordered Posting
- 74. WDM 75: Cluster Pruning
- 75. WDM 76: Parametric Zone Tired Index
- 76. WDM 77: Query Term Proximity And Query Parsing
- 77. WDM 78: How A Search Engine Works
- 78. WDM 79: Performance of a Search Engine Part 1
- 79. WDM 80: Performance of a Search Engine Part 2
- 80. WDM 81: Performance of a Search Engine Part 3
- 81. WDM 82: Performance of a Search Engine Part 4
- 82. WDM 83: Performance of a Search Engine Part 5
- 93. 84 Class Discussion On ECommerce Vs Traditional Businesses
- 91. 85 Various Pricing Models For Online Advertisement
- 94. 86 AdWords AdSense
- 92. 87 SEM And SEO
- 90. 88 Introduction to Classification
- 89. 89 Document classification
- 88. 90 Manual Classification Methods
- 87. 91 Naive Bayes Classifiers
- 84. 92 What is a Reputation System
- 86. 93 Examples of Reputation System
- 85. 94 Limitations of Reputation System
101 to the end
- 105. 102 Association Rule Introduction
- 102. 103 Market Basket Model and Frequent Item Sets
- 109. 104 A formal approach to Association Rules
- 103. 105 How to find association Rules
- 100. 106 Storage Considerations for Market Basaket
- 101. 107 Memory Bottleneck in Storage of Market Basket
- 108. 108 A Naive Algorithm to discover Association Rules Part1
- 107. 109 A Naive Algorithm to discover Association Rules Part2
- 106. 110 A Priori Algorithm
- 104. 111 Extension of A Priori Algorithm
- 126. WDM 112: How a Web Crawler Works
- 124. WDM 113: Complications in Crawling
- 122. WDM 114: Advance Crawler Architecture
- 125. WDM 115: URL Frontier
- 123. WDM 116: URL Frontier Using Mercator Scheme
- 120. WDM 116: Various Classification Methods
- 121. WDM 117: Bayes Rules Of Text Classification
- 119. WDM 118: Example of Multivariate Bernouli Model
- 118. WDM 119: Second Version of Naive Bayes
- 117. WDM 120: Example of Second Version of Naive Bayes
- 113. WDM 121: Rocchio Algorithms
- 115. WDM 122: K Nearest Neighbor Algorithms
- 116. WDM 123: Discussion on K Nearest Neighbor
- 114. WDM 124: Proof of Rocchio's Algorithm as linear classifier
- 111. WDM 125: Worked out Example On Rocchio Algorithms
- 112. WDM 126: Examples On Bigram Index
- 110. WDM 127: Final Thoughts and Future Action Items
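Related articles:
Understanding "The Mathematics of Google PageRank" (explained with linear algebra and matrices) through YouTube videos
http://computer-technology.hateblo.jp/entry/20140528/p2
University lecture videos on YouTube for studying "Data Mining and Machine Learning": big data analysis and learning models
http://computer-technology.hateblo.jp/entry/20150901/p1
YouTube videos from a study group on "Introduction to Pattern Recognition and Machine Learning"
http://computer-technology.hateblo.jp/entry/20140528/p3
Videos giving an easy-to-follow beginner's introduction to cryptography
http://computer-technology.hateblo.jp/entry/20140519/p3
YouTube videos for getting started with "formal methods" (formal specification)
http://computer-technology.hateblo.jp/entry/20140623/p1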