

Posted: 2024-03-12



IEMS 5730 Spring 2024 Homework 2
Release date: Feb 23, 2024
Due date: Mar 11, 2024 (Monday) 11:59:00 pm
We will discuss the solution soon after the deadline. No late homework will be accepted!
Every student MUST include the following statement, together with his/her signature, in the
submitted homework.
I declare that the assignment submitted on Elearning system is original
except for source material explicitly acknowledged, and that the same or
related material has not been previously submitted for another course. I
also acknowledge that I am aware of University policy and regulations on
honesty in academic work, and of the disciplinary guidelines and
procedures applicable to breaches of such policy and regulations, as
contained in the website
http://www.cuhk.edu.hk/policy/academichonesty/.
Signed (Student_________________________) Date:______________________________
Name_________________________________ SID_______________________________
Submission notice:
● Submit your homework via the elearning system.
● All students are required to submit this assignment.
General homework policies:
A student may discuss the problems with others. However, the work a student turns in must
be created COMPLETELY by oneself ALONE. A student may not share ANY written work or
pictures, nor may one copy answers from any source other than one’s own brain.
Each student MUST LIST on the homework paper the name of every person he/she has
discussed or worked with. If the answer includes content from any other source, the
student MUST STATE THE SOURCE. Failure to do so is cheating and will result in
sanctions. Copying answers from someone else is cheating even if one lists their name(s) on
the homework.
If there is information you need to solve a problem, but the information is not stated in the
problem, try to find the data somewhere. If you cannot find it, state what data you need,
make a reasonable estimate of its value, and justify any assumptions you make. You will be
graded not only on whether your answer is correct, but also on whether you have done an
intelligent analysis.
Submit your output, explanation, and your commands/scripts in one SINGLE PDF file.
Q1 [20 marks + 5 Bonus marks]: Basic Operations of Pig
You are required to perform some simple analysis using Pig on the n-grams dataset of
Google books. An ‘n-gram’ is a phrase with n words. The dataset lists all n-grams present in
books from books.google.com along with some statistics.
In this question, you only use the Google Books 1-gram data (the field is labeled "bigram"
below, but each entry is a single word). Please go to References [1] and [2] to download the
two datasets. Each line in these two files has the following format (TAB separated):
bigram year match_count volume_count
An example for 1-grams would be:
circumvallate 1978 335 91
circumvallate 1979 261 95
This means that in 1978 (1979), the word "circumvallate" occurred 335 (261) times overall,
in 91 (95) distinct books.
(a) [Bonus 5 marks] Install Pig in your Hadoop cluster. You can reuse your Hadoop
cluster in IEMS 5730 HW#0 and refer to the following link to install Pig 0.17.0 on
the master node of your Hadoop cluster:
http://pig.apache.org/docs/r0.17.0/start.html#Pig+Setup
Submit the screenshot(s) of your installation process.
If you choose not to do the bonus question in (a), you can use any well-installed Hadoop
cluster, e.g., the IE DIC, or the Hadoop cluster provided by the Google Cloud/AWS [5, 6, 7]
to complete the following parts of the question:
(b) [5 marks] Upload these two files to HDFS and join them into one table.
(c) [5 marks] For each unique bigram, compute its average number of occurrences per
year. In the above example, the result is:
circumvallate (335 + 261) / 2 = 298
Notes: The denominator is the number of years in which that word has appeared.
Assume the data set contains all the 1-grams in the last 100 years, and the above
records are the only records for the word ‘circumvallate’. Then the average value is:
(335 + 261) / 2 = 298,
instead of
(335 + 261) / 100 = 5.96
(d) [10 marks] Output the 20 bigrams with the highest average number of occurrences
per year, along with their corresponding average values, sorted in descending order. If
multiple bigrams have the same average value, report any one of them (that is,
break ties as you wish).
You need to write a Pig script to perform this task and save the output into HDFS.
Hints:
● This problem is very similar to the word counting example shown in the lecture notes
of Pig. You can use the code there and just make some minor changes to perform
this task.
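The arithmetic in (c) and (d) can be illustrated with a minimal sketch in Python. This is not the required Pig script; it only shows the intended computation on the two example records above. The function names `average_per_year` and `top_k` are hypothetical, and the sketch assumes each (word, year) pair appears at most once in the combined data.

```python
from collections import defaultdict


def average_per_year(records):
    """Average match_count per year for each word.

    records: iterable of (word, year, match_count, volume_count) tuples.
    The denominator is the number of distinct years in which the word appears.
    """
    totals = defaultdict(int)
    years = defaultdict(set)
    for word, year, match_count, _volume_count in records:
        totals[word] += match_count
        years[word].add(year)
    return {word: totals[word] / len(years[word]) for word in totals}


def top_k(averages, k=20):
    """Return the k (word, average) pairs with the highest averages, descending."""
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:k]


# The two example records from the problem statement.
records = [
    ("circumvallate", 1978, 335, 91),
    ("circumvallate", 1979, 261, 95),
]
averages = average_per_year(records)
print(averages)  # {'circumvallate': 298.0}
```

In Pig, the same shape of computation would typically be a group-by on the word followed by an average and an ordered limit; the exact script is left to the student.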
Q2 [20 marks + 5 bonus marks]: Basic Operations of Hive
In this question, you are asked to repeat Q1 using Hive and then compare the performance
between Hive and Pig.
(a) [Bonus 5 marks] Install Hive on top of your own Hadoop cluster. You can reuse your
Hadoop cluster in IEMS 5730 HW#0 and refer to the following link to install Hive
2.3.8 on the master node of your Hadoop cluster:
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
Submit the screenshot(s) of your installation process.
If you choose not to do the bonus question in (a), you can use any well-installed Hadoop
cluster, e.g., the IE DIC, or the Hadoop cluster provided by the Google Cloud/AWS [5, 6, 7].
(b) [20 marks] Write a Hive script to perform exactly the same task as that of Q1 with
the same datasets stored in HDFS. Rerun the Pig script in this cluster,
compare the performance of Pig and Hive in terms of overall run-time, and
explain your observations.
Hints:
● Hive stores its tables on HDFS, and these locations need to be created first
(-p creates missing parent directories and does not fail if the directory exists):
$ hdfs dfs -mkdir -p /tmp
$ hdfs dfs -mkdir -p /user/hive/warehouse
$ hdfs dfs -chmod g+w /tmp
$ hdfs dfs -chmod g+w /user/hive/warehouse
● While working with the interactive shell (or otherwise), you should first test on a small
subset of the data instead of the whole data set. Once your Hive commands/scripts
work as desired, you can then run them on the complete data set.
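One possible way to measure the overall run-time for the Pig versus Hive comparison is to time each job from a small driver script. The following Python sketch is only an illustration: `time_command` is a hypothetical helper, and the script names mentioned in the comments are placeholders, not files provided with this homework.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the job fails
    return time.perf_counter() - start


# Stand-in command so the sketch runs anywhere. On a cluster, one would pass
# something like ["pig", "-f", "q1.pig"] or ["hive", "-f", "q2.hql"] instead
# (hypothetical script names).
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.3f} s")
```

Wall-clock time includes job start-up and scheduling overhead, so repeating each run a few times gives a fairer comparison.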
Q3 [30 marks + 10 Bonus marks]: Similar Users Detection in the MovieLens Dataset using Pig
Similar-user detection, which aims to group users with similar interests, behaviors,
actions, or general patterns, has drawn a lot of attention in the machine learning field. In
this homework, you will implement a similar-users-detection algorithm for an online movie
rating system. Basically, users who give similar scores to the same movies may have common
tastes or interests and can be grouped as similar users.
To detect similar users, we need to calculate the similarity between each user pair. In this
homework, the similarity between a given pair of users (e.g. A and B) is measured as the
total number of movies both A and B have watched divided by the total number of
movies watched by either A or B. The following is the formal definition of similarity: Let
M(A) be the set of all the movies user A has watched. Then the similarity between user A
and user B is defined as:
$$\mathrm{Similarity}(A, B) = \frac{|M(A) \cap M(B)|}{|M(A) \cup M(B)|} \qquad (**)$$
where |S| means the cardinality of set S.
(Note: if |𝑀(𝐴)∪𝑀(𝐵)| = 0, we set the similarity to be 0.)
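As a quick check of definition (**), here is a minimal Python sketch of the similarity measure, including the zero-union special case. The function name `similarity` and the movie IDs in the example are made up for illustration.

```python
def similarity(movies_a, movies_b):
    """Size of the intersection divided by size of the union; 0 if the union is empty."""
    a, b = set(movies_a), set(movies_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Two movies in common (2 and 3) out of four distinct movies overall.
print(similarity({1, 2, 3}, {2, 3, 4}))  # 0.5
```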
The original handout includes a figure illustrating the idea (figure not reproduced here).
Two datasets [3][4] with different sizes are provided by MovieLens. Each user is represented
by its unique userID and each movie is represented by its unique movieID. The format of the
data set is as follows:
<userID>, <movieID>
Write a program in Pig to detect the TOP K similar users for each user. You can use the
cluster you built for Q1 and Q2 or you can use the IE DIC or one provided by the Google
Cloud/AWS [5, 6, 7].
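To make the TOP K task concrete, the sketch below works through it in plain Python on a tiny made-up input. It is not a Pig solution. The function name `top_k_similar`, the sample ratings, and the tie-breaking rule (smaller userID first) are all assumptions for illustration. It compares every pair of users directly, which is only practical for small inputs.

```python
from collections import defaultdict
from itertools import combinations


def top_k_similar(ratings, k):
    """For each user, return the k most similar other users.

    ratings: iterable of (userID, movieID) pairs.
    Similarity is |M(A) & M(B)| / |M(A) | M(B)|, or 0 if the union is empty.
    Ties are broken by smaller userID (an arbitrary choice).
    """
    movies = defaultdict(set)
    for user, movie in ratings:
        movies[user].add(movie)

    scores = defaultdict(list)
    for a, b in combinations(sorted(movies), 2):
        union = movies[a] | movies[b]
        sim = len(movies[a] & movies[b]) / len(union) if union else 0.0
        scores[a].append((sim, b))
        scores[b].append((sim, a))

    return {
        user: [other for _, other in sorted(pairs, key=lambda t: (-t[0], t[1]))[:k]]
        for user, pairs in scores.items()
    }


# Users 1 and 2 watched the same two movies; user 3 shares one movie with each.
ratings = [(1, 10), (1, 20), (2, 10), (2, 20), (3, 20), (3, 30)]
result = top_k_similar(ratings, k=1)
print(result)  # {1: [2], 2: [1], 3: [1]}
```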
(a) [10 marks] For each pair of users in each of the datasets [3] and [4], output the
number of movies they have both watched.
For your homework submission, you need to submit i) the Pig script and ii) the
list of the 10 pairs of users having the largest number of movies watched by
both users in the pair within the corresponding dataset. The format of your
answer should be as follows: