CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging:
three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK (https://www.nltk.org/)
you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from https://universaldependencies.org/.
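The conllu package can read the .conllu files of the treebanks into tagged sentences. Purely as an illustration (the provided Python files may already take care of this, and the path below is only a placeholder), this might look roughly as follows:

```python
# A minimal sketch, assuming the conllu package is installed; the file
# name is a placeholder, not one of the provided files.
from conllu import parse_incr

def tagged_sentences(path):
    """Yield each sentence as a list of (word, universal POS tag) pairs."""
    with open(path, "r", encoding="utf-8") as data_file:
        for sentence in parse_incr(data_file):
            # Recent versions of conllu expose the tag as token["upos"];
            # older versions call the same field "upostag".
            yield [(token["form"], token["upos"]) for token in sentence]

# Example (placeholder path):
# for sent in tagged_sentences("treebanks/some-language-train.conllu"):
#     print(sent)
```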
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
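By way of illustration only, the counting and smoothing steps above might be sketched as follows. The function and variable names are invented here, the training data is assumed to be lists of (word, tag) pairs, and bins=1e5 is an arbitrary guess at the number of possible events rather than a tuned value:

```python
# A minimal sketch (not the provided smoothing.py): estimate smoothed
# emission and transition distributions with Witten-Bell smoothing.
from nltk import FreqDist, WittenBellProbDist

START, END = "<s>", "</s>"  # start- and end-of-sentence markers

def estimate_hmm(train_sents):
    """train_sents: iterable of sentences, each a list of (word, tag) pairs.
    Returns (transitions, emissions): dictionaries of smoothed
    distributions, indexed by the conditioning tag."""
    emission_counts = {}                     # tag -> FreqDist over words
    transition_counts = {START: FreqDist()}  # tag -> FreqDist over next tags
    for sent in train_sents:
        prev = START
        for word, tag in sent:
            emission_counts.setdefault(tag, FreqDist())[word] += 1
            transition_counts.setdefault(prev, FreqDist())[tag] += 1
            prev = tag
        transition_counts.setdefault(prev, FreqDist())[END] += 1
    emissions = {t: WittenBellProbDist(fd, bins=1e5)
                 for t, fd in emission_counts.items()}
    transitions = {t: WittenBellProbDist(fd, bins=1e5)
                   for t, fd in transition_counts.items()}
    return transitions, emissions
```

With such distributions, transitions["<s>"].prob("DET") and emissions["NOUN"].prob("dog") would give smoothed estimates of P(DET | ⟨s⟩) and P(dog | NOUN).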
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, . . . , n, in this order:
\[
\hat{t}_i = \operatorname*{argmax}_{t_i}\; P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)
\]
assuming \(\hat{t}_0\) is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
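Under the same assumptions as the estimation sketch above (distributions with a .prob() method, indexed by the conditioning tag), the eager algorithm is a single left-to-right pass; the following is only an illustrative sketch:

```python
# A minimal sketch of the eager algorithm; `transitions`, `emissions`
# and `tagset` are assumed to come from the estimation step.
START = "<s>"

def eager_tag(sentence, transitions, emissions, tagset):
    tags = []
    prev = START
    for word in sentence:
        best = max(tagset,
                   key=lambda t: transitions[prev].prob(t) * emissions[t].prob(word))
        tags.append(best)
        prev = best
    return tags
```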
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
\[
\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n} \left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)
\]
where the tokens of the input sentence are \(w_1 \cdots w_n\), and \(t_0\) = ⟨s⟩ and \(t_{n+1}\) = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
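One possible shape for the dynamic programming tables, in log space and under the same assumptions as the earlier sketches, is the following; it is a sketch, not the required implementation:

```python
import math

START, END = "<s>", "</s>"

# A minimal sketch of the Viterbi algorithm with log probabilities,
# assuming `transitions`, `emissions` and `tagset` as in the earlier
# sketches, and a non-empty sentence.
def viterbi_tag(sentence, transitions, emissions, tagset):
    n = len(sentence)
    tagset = list(tagset)
    # best[i][t]: highest log probability of a tag sequence for the first
    # i+1 tokens that ends in tag t; back[i][t] records the argmax.
    best = [dict() for _ in range(n)]
    back = [dict() for _ in range(n)]
    for t in tagset:
        best[0][t] = (math.log(transitions[START].prob(t))
                      + math.log(emissions[t].prob(sentence[0])))
    for i in range(1, n):
        for t in tagset:
            prev = max(tagset,
                       key=lambda u: best[i - 1][u] + math.log(transitions[u].prob(t)))
            best[i][t] = (best[i - 1][prev]
                          + math.log(transitions[prev].prob(t))
                          + math.log(emissions[t].prob(sentence[i])))
            back[i][t] = prev
    # Account for the transition into ⟨/s⟩, then follow the back pointers.
    last = max(tagset,
               key=lambda t: best[n - 1][t] + math.log(transitions[t].prob(END)))
    tags = [last]
    for i in range(n - 1, 0, -1):
        tags.append(back[i][tags[-1]])
    return list(reversed(tags))
```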
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
individually. That is, for each i, we compute:
\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \sum_{t_1 \cdots t_{i-1}\, t_{i+1} \cdots t_n} \left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)
\]
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
\[
\hat{t}_i = \operatorname*{argmax}_{t_i} \left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot \left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)
\]
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
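Independently of the provided logsumexptrick.py, a log-sum-exp helper and a forward pass in log space might be sketched as follows; backward values follow the same pattern from the other end of the sentence:

```python
import math

START = "<s>"

def logsumexp(log_vals):
    """log(sum(exp(v) for v in log_vals)), computed by factoring out the
    maximum so that the exponentials do not underflow."""
    m = max(log_vals)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in log_vals))

# A minimal sketch of forward values in log space, under the same
# assumptions about `transitions`, `emissions` and `tagset` as before.
def forward(sentence, transitions, emissions, tagset):
    n = len(sentence)
    tagset = list(tagset)
    alpha = [dict() for _ in range(n)]
    for t in tagset:
        alpha[0][t] = (math.log(transitions[START].prob(t))
                       + math.log(emissions[t].prob(sentence[0])))
    for i in range(1, n):
        for t in tagset:
            alpha[i][t] = logsumexp(
                [alpha[i - 1][u] + math.log(transitions[u].prob(t)) for u in tagset]
            ) + math.log(emissions[t].prob(sentence[i]))
    return alpha
```

With an analogous backward table beta (which also accounts for the transition into ⟨/s⟩), the tag chosen for position i is the one maximising alpha[i][t] + beta[i][t].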
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
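Purely as a sketch, the accuracy figures could be computed as below, assuming each tagger is a function from a list of words to a list of tags and the test data is again lists of (word, tag) pairs:

```python
# A minimal sketch of per-token tagging accuracy; names are illustrative.
def accuracy(tagger, test_sents):
    correct = total = 0
    for sent in test_sents:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        predicted = tagger(words)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total
```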
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and band
width for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, mistaking this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.