IEMS 5730 Spring 2024 Homework 2
Release date: Feb 23, 2024
Due date: Mar 11, 2024 (Monday) 11:59:00 pm
We will discuss the solution soon after the deadline. No late homework will be accepted!
Every Student MUST include the following statement, together with his/her signature in the
submitted homework.
I declare that the assignment submitted on Elearning system is original
except for source material explicitly acknowledged, and that the same or
related material has not been previously submitted for another course. I
also acknowledge that I am aware of University policy and regulations on
honesty in academic work, and of the disciplinary guidelines and
procedures applicable to breaches of such policy and regulations, as
contained in the website
http://www.cuhk.edu.hk/policy/academichonesty/.
Signed (Student_________________________) Date:______________________________
Name_________________________________ SID_______________________________
Submission notice:
● Submit your homework via the elearning system.
● All students are required to submit this assignment.
General homework policies:
A student may discuss the problems with others. However, the work a student turns in must
be created COMPLETELY by oneself ALONE. A student may not share ANY written work or
pictures, nor may one copy answers from any source other than one’s own brain.
Each student MUST LIST on the homework paper the name of every person he/she has
discussed or worked with. If the answer includes content from any other source, the
student MUST STATE THE SOURCE. Failure to do so is cheating and will result in
sanctions. Copying answers from someone else is cheating even if one lists their name(s) on
the homework.
If there is information you need to solve a problem, but the information is not stated in the
problem, try to find the data somewhere. If you cannot find it, state what data you need,
make a reasonable estimate of its value, and justify any assumptions you make. You will be
graded not only on whether your answer is correct, but also on whether you have done an
intelligent analysis.
Submit your output, explanation, and your commands/scripts in one SINGLE PDF file.
Q1 [20 marks + 5 Bonus marks]: Basic Operations of Pig
You are required to perform some simple analysis using Pig on the n-grams dataset of
Google books. An ‘n-gram’ is a phrase with n words. The dataset lists all n-grams present in
books from books.google.com along with some statistics.
In this question, you only use the Google Books 1-gram data (single words; the field is
labelled "bigram" below). Please go to References [1] and [2] to download the two datasets.
Each line in these two files has the following format (TAB separated):
bigram year match_count volume_count
An example for 1-grams would be:
circumvallate 1978 335 91
circumvallate 1979 261 95
This means that in 1978 (1979), the word "circumvallate" occurred 335 (261) times overall,
in 91 (95) distinct books.
(a) [Bonus 5 marks] Install Pig in your Hadoop cluster. You can reuse your Hadoop
cluster in IEMS 5730 HW#0 and refer to the following link to install Pig 0.17.0 on
the master node of your Hadoop cluster:
http://pig.apache.org/docs/r0.17.0/start.html#Pig+Setup
Submit the screenshot(s) of your installation process.
If you choose not to do the bonus question in (a), you can use any well-installed Hadoop
cluster, e.g., the IE DIC, or the Hadoop cluster provided by the Google Cloud/AWS [5, 6, 7]
to complete the following parts of the question:
(b) [5 marks] Upload these two files to HDFS and join them into one table.
(c) [5 marks] For each unique bigram, compute its average number of occurrences per
year. In the above example, the result is:
circumvallate (335 + 261) / 2 = 298
Notes: The denominator is the number of years in which that word has appeared.
Assume the data set contains all the 1-grams in the last 100 years, and the above
records are the only records for the word ‘circumvallate’. Then the average value is:
(335 + 261) / 2 = 298,
instead of
(335 + 261) / 100 = 5.96
(d) [10 marks] Output the 20 bigrams with the highest average number of occurrences
per year along with their corresponding average values sorted in descending order. If
multiple bigrams have the same average value, write down any one of them (that is,
break ties as you wish).
You need to write a Pig script to perform this task and save the output into HDFS.
Hints:
● This problem is very similar to the word counting example shown in the lecture notes
of Pig. You can use the code there and just make some minor changes to perform
this task.
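The averaging rule in (c) and the ranking in (d) can be checked on a small sample before writing the Pig script. The following is a minimal Python sketch of the arithmetic only, not the required Pig solution. The function names `average_per_year` and `top_k` are hypothetical, and the sketch assumes that the denominator is the number of distinct years in which a word appears.

```python
from collections import defaultdict

def average_per_year(lines):
    """Map each word to its total match_count divided by the number of distinct years it appears in."""
    totals = defaultdict(int)   # word -> sum of match_count
    years = defaultdict(set)    # word -> distinct years seen (assumption: count each year once)
    for line in lines:
        word, year, match_count, _volume_count = line.rstrip("\n").split("\t")
        totals[word] += int(match_count)
        years[word].add(year)
    return {word: totals[word] / len(years[word]) for word in totals}

def top_k(averages, k=20):
    """Return the k (word, average) pairs with the largest averages, in descending order."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:k]

# The two example records from the problem statement (TAB separated).
sample = [
    "circumvallate\t1978\t335\t91",
    "circumvallate\t1979\t261\t95",
]
averages = average_per_year(sample)
print(averages["circumvallate"])  # (335 + 261) / 2 = 298.0
print(top_k(averages, k=20))
```

Comparing the output of such a sketch against the Pig output on a small subset of the data is one way to validate the script before running it on the full files.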
Q2 [20 marks + 5 bonus marks]: Basic Operations of Hive
In this question, you are asked to repeat Q1 using Hive and then compare the performance
between Hive and Pig.
(a) [Bonus 5 marks] Install Hive on top of your own Hadoop cluster. You can reuse your
Hadoop cluster in IEMS 5730 HW#0 and refer to the following link to install Hive
2.3.8 on the master node of your Hadoop cluster:
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
Submit the screenshot(s) of your installation process.
If you choose not to do the bonus question in (a), you can use any well-installed Hadoop
cluster, e.g., the IE DIC, or the Hadoop cluster provided by the Google Cloud/AWS [5, 6, 7].
(b) [20 marks] Write a Hive script to perform exactly the same task as that of Q1 with
the same datasets stored in the HDFS. Rerun the Pig script in this cluster and
compare the performance between Pig and Hive in terms of overall run-time and
explain your observation.
Hints:
● Hive will store its tables on HDFS and those locations need to be bootstrapped:
$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse
● While working with the interactive shell (or otherwise), you should first test on a small
subset of the data instead of the whole data set. Once your Hive commands/scripts
work as desired, you can then run them on the complete data set.
Q3 [30 marks + 10 Bonus marks]: Similar Users Detection in the MovieLens Dataset using Pig
Similar-user detection, which aims to group users with similar interests, behaviors, actions,
or general patterns, has drawn a lot of attention in the machine learning field. In this
homework, you will implement a similar-users-detection algorithm for an online movie rating
system. Basically, users who give similar scores to the same movies may have common
tastes or interests and can be grouped as similar users.
To detect similar users, we need to calculate the similarity between each user pair. In this
homework, the similarity between a given pair of users (e.g. A and B) is measured as the
total number of movies both A and B have watched divided by the total number of
movies watched by either A or B. The following is the formal definition of similarity: Let
M(A) be the set of all the movies user A has watched. Then the similarity between user A
and user B is defined as:
\[ \mathrm{Similarity}(A, B) = \frac{|M(A) \cap M(B)|}{|M(A) \cup M(B)|} \qquad (**) \]
where |S| means the cardinality of set S.
(Note: if |M(A) ∪ M(B)| = 0, we set the similarity to be 0.)
The following figure illustrates the idea:
Two datasets [3][4] with different sizes are provided by MovieLens. Each user is represented
by its unique userID and each movie is represented by its unique movieID. The format of the
data set is as follows:
<userID>, <movieID>
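To make the similarity definition (**) concrete, here is a minimal Python sketch, not the required Pig program. It assumes input lines in the `<userID>, <movieID>` format described above. The helper names `movies_by_user` and `similarity` and the sample IDs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

def movies_by_user(lines):
    """Build M(user): the set of movieIDs each user has watched."""
    watched = defaultdict(set)
    for line in lines:
        user_id, movie_id = (field.strip() for field in line.split(","))
        watched[user_id].add(movie_id)
    return watched

def similarity(movies_a, movies_b):
    """|M(A) ∩ M(B)| / |M(A) ∪ M(B)|, defined as 0 when the union is empty."""
    union = movies_a | movies_b
    if not union:
        return 0.0
    return len(movies_a & movies_b) / len(union)

# Made-up sample: user 1 watched {10, 20, 30}, user 2 watched {20, 30, 40}.
sample = ["1, 10", "1, 20", "1, 30", "2, 20", "2, 30", "2, 40"]
watched = movies_by_user(sample)
for a, b in combinations(sorted(watched), 2):
    common = len(watched[a] & watched[b])   # number of movies both users watched
    print(a, b, common, similarity(watched[a], watched[b]))
```

In this sample the two users share 2 movies out of 4 distinct movies overall, so their similarity is 2 / 4 = 0.5.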
Write a program in Pig to detect the TOP K similar users for each user. You can use the
cluster you built for Q1 and Q2 or you can use the IE DIC or one provided by the Google
Cloud/AWS [5, 6, 7].
(a) [10 marks] For each pair of users in each of the datasets [3] and [4], output the number of
movies they have both watched.
For your homework submission, you need to submit i) the Pig script and ii) the
list of the 10 pairs of users having the largest number of movies watched by
both users in the pair within the corresponding dataset. The format of your
answer should be as follows: