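How to use the Stanford POS Tagger for part-of-speech tagging (repost)
Answers: 2 Bounty: 40
Resolved: 2021-03-06 18:14
- Asked by user: 无悔疯狂
- 2021-03-05 23:54
How to use the Stanford POS Tagger for part-of-speech tagging (repost)
Best answer
- Level-2 knowledge expert, user: 寂寞的炫耀
- 2021-03-06 00:29
Search for: How to use the Stanford POS Tagger for part-of-speech tagging (repost)
All answers
- Reply #1, user: 有钳、任性
- 2021-03-06 00:53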
Tagging text with Stanford POS Tagger in Java Applications
I was looking for a way to extract nouns from a set of strings in Java and, through Google, found the POS tagger from the Stanford NLP (Natural Language Processing) Group.
The library lets you “tag” the words in a string. That is, for each word, the tagger determines whether it is a noun, a verb, etc., and attaches the resulting tag to the word. For example:
“This is a sample sentence”
will be output as
This/DT is/VBZ a/DT sample/NN sentence/NN
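The word/TAG output shown above is easy to take apart again. The following is a minimal sketch in plain Java, not part of the Stanford library: the class name TaggedTokenParser and its nested TaggedToken type are hypothetical, and it assumes the "/" separator used in this post (adjust it if your release of the tagger prints a different one).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits tagger output such as "This/DT is/VBZ" into word/tag pairs. */
final class TaggedTokenParser {

    /** A word together with its part-of-speech tag. */
    static final class TaggedToken {
        final String word;
        final String tag;

        TaggedToken(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return word + " -> " + tag;
        }
    }

    /** Parses whitespace-separated word/TAG items; malformed items are skipped. */
    static List<TaggedToken> parse(String tagged) {
        List<TaggedToken> tokens = new ArrayList<>();
        for (String item : tagged.trim().split("\\s+")) {
            // Split on the LAST slash so a word that itself contains '/' stays intact.
            int slash = item.lastIndexOf('/');
            if (slash <= 0 || slash == item.length() - 1) {
                continue; // not in word/TAG form
            }
            tokens.add(new TaggedToken(item.substring(0, slash), item.substring(slash + 1)));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (TaggedToken token : parse("This/DT is/VBZ a/DT sample/NN sentence/NN")) {
            System.out.println(token);
        }
    }
}
```

Splitting on the last slash rather than the first is a deliberate choice here, so that a token like "1/2/CD" is read as the word "1/2" with the tag "CD".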
To do this, the tagger has to load a “trained” file that contains the necessary information for the tagger to tag the string. This “trained” file is called a model and has the extension “.tagger”. There are several trained models provided by Stanford NLP group for different languages.
In this post I will show you how to use this library in your Java application with the Eclipse IDE.
1. Create a new project.
2. Create a new folder called “taggers” in the project.
3. Download the zip file provided by the Stanford NLP Group.
4. Extract the zip file and open the extracted folder.
5. Open the folder called “models” and copy the model you want, together with its corresponding “.props” file (same name), into the “taggers” folder created earlier.
6. Import the library into the project so that Eclipse does not complain when the code uses it: right-click your project > Build Path > Configure Build Path.
7. In the new window, open the Libraries tab (at the top) and click the Add External JARs button.
8. Locate the “stanford-postagger.jar” file in the extracted folder and select it.
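Before moving on to the code, it can help to confirm that the copy step worked. The sketch below uses only the standard library; the class ModelFilesCheck is hypothetical, and it assumes the props file is named after the model file with ".props" appended, so change that line if your download names it differently.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-flight check for the "taggers" folder described in the steps above. */
final class ModelFilesCheck {

    /** Returns true only if both the model file and its props file exist in the folder. */
    static boolean hasModelAndProps(Path folder, String modelFileName) {
        Path model = folder.resolve(modelFileName);
        // Assumption: the props file is the model file name plus ".props".
        Path props = folder.resolve(modelFileName + ".props");
        return Files.isRegularFile(model) && Files.isRegularFile(props);
    }

    public static void main(String[] args) {
        // Relative path: resolved against the directory the program is started from.
        Path folder = Paths.get("taggers");
        String model = "left3words-distsim-wsj-0-18.tagger";
        if (hasModelAndProps(folder, model)) {
            System.out.println("Model and props file found in " + folder);
        } else {
            System.out.println("Missing " + model + " or its .props file in " + folder);
        }
    }
}
```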
That is enough configuration; let’s start coding. In your project, create a new class and write the following in its main method:
// Requires: import edu.stanford.nlp.tagger.maxent.MaxentTagger;
// Initialize the tagger
MaxentTagger tagger = new MaxentTagger("taggers/left3words-distsim-wsj-0-18.tagger");
The MaxentTagger constructor takes the path to the model (trained file) as a parameter:
“NAME_OF_FOLDER/NAME_OF_MODEL.tagger”.
Once you write the code, Eclipse will prompt you to import MaxentTagger and warn you that the constructor throws some exceptions. Use Eclipse's quick fixes to add the import and the exception handling to the code.
Finally, we tag the string we want:
// The sample string
String sample = "This is a sample text";
// The tagged string
String tagged = tagger.tagString(sample);
// Output the result
System.out.println(tagged);
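Since the original goal was to extract nouns, one possible follow-up is to filter the tagged string by tag. This is a sketch rather than a feature of the library: NounExtractor is a hypothetical class, the tagged string in main is a hand-written stand-in for the value returned by tagger.tagString(sample), and the "/" separator is assumed from the example earlier in this post. It relies on the Penn Treebank convention that noun tags start with "NN" (NN, NNS, NNP, NNPS).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: pulls the nouns out of a string tagged in word/TAG form. */
final class NounExtractor {

    /** Returns the words whose tag starts with "NN", in their original order. */
    static List<String> extractNouns(String tagged) {
        List<String> nouns = new ArrayList<>();
        for (String item : tagged.trim().split("\\s+")) {
            int slash = item.lastIndexOf('/');
            if (slash <= 0) {
                continue; // no word/TAG separator in this item
            }
            String tag = item.substring(slash + 1);
            // Penn Treebank noun tags all start with "NN": NN, NNS, NNP, NNPS.
            if (tag.startsWith("NN")) {
                nouns.add(item.substring(0, slash));
            }
        }
        return nouns;
    }

    public static void main(String[] args) {
        // Hand-written stand-in for the value returned by tagger.tagString(sample).
        String tagged = "This/DT is/VBZ a/DT sample/NN text/NN";
        System.out.println(extractNouns(tagged));
    }
}
```

In a real application, the literal string in main would be replaced by the tagged variable from the snippet above.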