Date: 2024-03-03



CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part-of-speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK
(https://www.nltk.org/), you will also need to install the conllu package
(https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from
https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
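The counting steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the provided files: the toy corpus, the function names `count_events` and `relative_frequencies`, and the representation of a sentence as a list of (word, tag) pairs are all hypothetical. The estimates here are deliberately unsmoothed; in the practical, each row of counts would instead be passed to a smoothed distribution such as NLTK's WittenBellProbDist.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def count_events(tagged_sentences):
    """Count tag bigrams (including sentence markers) and (tag, word) pairs."""
    transition_counts = defaultdict(Counter)
    emission_counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        tags = [START] + [tag for _, tag in sentence] + [END]
        for previous, current in zip(tags, tags[1:]):
            transition_counts[previous][current] += 1
        for word, tag in sentence:
            emission_counts[tag][word] += 1
    return transition_counts, emission_counts


def relative_frequencies(counts):
    """Turn each row of counts into an UNSMOOTHED conditional distribution."""
    return {
        context: {event: n / sum(row.values()) for event, n in row.items()}
        for context, row in counts.items()
    }


# Hypothetical toy corpus: each sentence is a list of (word, tag) pairs.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

transition_counts, emission_counts = count_events(toy_corpus)
transitions = relative_frequencies(transition_counts)
emissions = relative_frequencies(emission_counts)

print(transitions[START])   # distribution over the first tag of a sentence
print(transitions["VERB"])  # every VERB in the toy corpus is followed by </s>
print(emissions["NOUN"])    # distribution over words emitted by NOUN
```

Without smoothing, any tag bigram or word that is absent from the training data gets probability zero, which is exactly the problem that smoothing addresses.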
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, . . . , n, in this order:
\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \; P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)
\]
assuming tˆ0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
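A minimal sketch of the eager algorithm, assuming the probabilities are available as nested dictionaries. The tables `TRANSITION` and `EMISSION`, their values, and the function name `eager_tag` are hypothetical and chosen only for illustration; a real implementation would look up smoothed estimates instead.

```python
START = "<s>"
TAGS = ["DET", "NOUN", "VERB"]

# Hypothetical toy model: TRANSITION[previous][tag] and EMISSION[tag][word].
TRANSITION = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.85, "VERB": 0.05, "</s>": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.5, "</s>": 0.2},
    "VERB": {"DET": 0.4, "NOUN": 0.3, "VERB": 0.1, "</s>": 0.2},
}
EMISSION = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.7, "barks": 0.25},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def eager_tag(words):
    """Pick each tag greedily, given only the previously chosen tag."""
    previous = START
    chosen = []
    for word in words:
        best = max(
            TAGS,
            key=lambda tag: TRANSITION[previous][tag] * EMISSION[tag][word],
        )
        chosen.append(best)
        previous = best  # the decision is final; it is never revisited
    return chosen


print(eager_tag(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Because each product involves only two factors, plain probabilities suffice here; no long chain of multiplications occurs.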
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
\[
\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n}
\left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)
\]
where the tokens of the input sentence are w1 · · ·wn, and t0 = ⟨s⟩ and tn+1 = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
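The following is a sketch of the Viterbi algorithm in log space, under the same kind of assumptions as any toy example: the tables `TRANSITION` and `EMISSION`, their values, and all function names are hypothetical, and every probability is assumed to be nonzero (which smoothing would guarantee). Products of probabilities become sums of log probabilities, so long sentences do not underflow.

```python
import math

START, END = "<s>", "</s>"
TAGS = ["DET", "NOUN", "VERB"]

# Hypothetical toy model: TRANSITION[previous][tag] and EMISSION[tag][word].
TRANSITION = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.85, "VERB": 0.05, END: 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.5, END: 0.2},
    "VERB": {"DET": 0.4, "NOUN": 0.3, "VERB": 0.1, END: 0.2},
}
EMISSION = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.7, "barks": 0.25},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def log_t(previous, tag):
    return math.log(TRANSITION[previous][tag])


def log_e(tag, word):
    return math.log(EMISSION[tag][word])


def sequence_log_prob(words, tags):
    """Log of the quantity being maximised, for one candidate tag sequence."""
    total, previous = 0.0, START
    for word, tag in zip(words, tags):
        total += log_t(previous, tag) + log_e(tag, word)
        previous = tag
    return total + log_t(previous, END)


def viterbi_tag(words):
    """Return the tag sequence with the highest joint log probability."""
    if not words:
        return []
    # best[i][tag]: best log probability of tagging words[:i + 1] ending in tag.
    best = [{tag: log_t(START, tag) + log_e(tag, words[0]) for tag in TAGS}]
    backpointers = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for tag in TAGS:
            previous = max(TAGS, key=lambda p: best[-1][p] + log_t(p, tag))
            scores[tag] = best[-1][previous] + log_t(previous, tag) + log_e(tag, word)
            pointers[tag] = previous
        best.append(scores)
        backpointers.append(pointers)
    # The transition to the end-of-sentence marker decides the final tag.
    last = max(TAGS, key=lambda tag: best[-1][tag] + log_t(tag, END))
    tags = [last]
    for pointers in reversed(backpointers):
        tags.append(pointers[tags[-1]])
    tags.reverse()
    return tags


print(viterbi_tag(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

For a sentence this short, the result can be checked by scoring all candidate tag sequences with `sequence_log_prob` and taking the maximum.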
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
individually. That is, for each i, we compute:
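\[
\hat{t}_i = \operatorname*{argmax}_{t_i}
\sum_{t_1 \cdots t_{i-1} t_{i+1} \cdots t_n}
\left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)
\]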
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
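\[
\hat{t}_i = \operatorname*{argmax}_{t_i}
\left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right)
\cdot
\left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)
\]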
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
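The forward and backward computations can be sketched as below. This is an illustrative sketch, not the contents of logsumexptrick.py: the helper `log_sum_exp`, the toy tables, and all function names are hypothetical, and every probability is assumed to be nonzero. The forward value for tag t at position i is the first bracketed factor in the formula above; the backward value is the second.

```python
import math

START, END = "<s>", "</s>"
TAGS = ["DET", "NOUN", "VERB"]

# Hypothetical toy model: TRANSITION[previous][tag] and EMISSION[tag][word].
TRANSITION = {
    START: {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
    "DET": {"DET": 0.05, "NOUN": 0.85, "VERB": 0.05, END: 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.5, END: 0.2},
    "VERB": {"DET": 0.4, "NOUN": 0.3, "VERB": 0.1, END: 0.2},
}
EMISSION = {
    "DET": {"the": 0.9, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.7, "barks": 0.25},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.8},
}


def log_t(previous, tag):
    return math.log(TRANSITION[previous][tag])


def log_e(tag, word):
    return math.log(EMISSION[tag][word])


def log_sum_exp(values):
    """log(sum(exp(v))) computed stably by factoring out the largest value."""
    highest = max(values)
    return highest + math.log(sum(math.exp(v - highest) for v in values))


def forward_backward_tag(words):
    """Return, for each token, the tag with the highest marginal probability."""
    if not words:
        return []
    # forward[i][tag]: log of the summed probability of all tag prefixes
    # that end in tag at position i, including the emission of words[i].
    forward = [{tag: log_t(START, tag) + log_e(tag, words[0]) for tag in TAGS}]
    for word in words[1:]:
        forward.append({
            tag: log_sum_exp([forward[-1][p] + log_t(p, tag) for p in TAGS])
            + log_e(tag, word)
            for tag in TAGS
        })
    # backward[i][tag]: log of the summed probability of everything after
    # position i (up to and including </s>), given tag at position i.
    backward = [{tag: log_t(tag, END) for tag in TAGS}]
    for word in reversed(words[1:]):
        backward.insert(0, {
            tag: log_sum_exp(
                [log_t(tag, nxt) + log_e(nxt, word) + backward[0][nxt] for nxt in TAGS]
            )
            for tag in TAGS
        })
    return [
        max(TAGS, key=lambda tag: f[tag] + b[tag])
        for f, b in zip(forward, backward)
    ]


print(forward_backward_tag(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Compared with the Viterbi recursion, the only structural change in the forward pass is that the maximum over previous tags is replaced by `log_sum_exp`, so no backpointers are needed.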
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
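As a sketch of the evaluation step, the percentage of correctly guessed tags can be computed per token over the whole test corpus. The function name `tagging_accuracy` and the representation of each sentence as a list of tags are hypothetical, and counting per token (rather than averaging per sentence) is an assumption about what "percentage of tags" means here.

```python
def tagging_accuracy(gold_sentences, predicted_sentences):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    correct = total = 0
    for gold, predicted in zip(gold_sentences, predicted_sentences):
        if len(gold) != len(predicted):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold, predicted))
        total += len(gold)
    return 100.0 * correct / total if total else 0.0


# Hypothetical gold and predicted tags for two short sentences.
gold = [["DET", "NOUN", "VERB"], ["NOUN", "VERB"]]
predicted = [["DET", "NOUN", "NOUN"], ["NOUN", "VERB"]]

print(f"{tagging_accuracy(gold, predicted):.1f}%")  # 80.0%
```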
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and bandwidth
for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it and mistaking this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.