AN APPLICATION FOR NER USING
N-GRAM TECHNIQUE
Made By:
Roopali Sethi (9911103534)F-2
What is NER??
Named Entity Recognition (NER) is an information extraction task concerned with recognizing and classifying named entities in free text. Named entity classes include, for instance, locations, person names, organization names, dates and monetary amounts.
Why Prefer the N-Gram Technique for NER??
This application is better in several respects:
=> Provides an interactive UI
=> User friendliness
=> As it is an easy-to-use program, it also saves time
=> It has all deployable functionalities
Functionalities!!
The following diagram explains the interconnectivity of the modules and their working.
Selection of Data Set
Applying Algorithm
Identify and Classify NE’s
Display Result
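The four stages above can be sketched in a few lines of Python. The slides do not show how N-grams are matched to entity classes, so this sketch assumes a simple dictionary lookup: every word N-gram of the input is compared exactly against a small hand-made gazetteer. The names `GAZETTEER`, `word_ngrams` and `tag_entities`, and the sample entries, are hypothetical and only illustrate the flow.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteer (assumption, not from the slides):
# lowercase phrase -> entity class.
GAZETTEER: Dict[str, str] = {
    "new delhi": "LOCATION",
    "reserve bank of india": "ORGANIZATION",
    "india": "LOCATION",
}

def word_ngrams(tokens: List[str], n: int) -> List[Tuple[int, str]]:
    """Return (start_index, phrase) for every run of n consecutive tokens."""
    return [(i, " ".join(tokens[i:i + n])) for i in range(len(tokens) - n + 1)]

def tag_entities(text: str, max_n: int = 4) -> List[Tuple[str, str]]:
    """Identify and classify entities by exact N-gram lookup, longest first."""
    # Very rough tokenization: lowercase, drop commas and periods, split on spaces.
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    used = [False] * len(tokens)  # tokens already covered by a longer match
    found: List[Tuple[int, str, str]] = []
    for n in range(max_n, 0, -1):
        for start, phrase in word_ngrams(tokens, n):
            if phrase in GAZETTEER and not any(used[start:start + n]):
                found.append((start, phrase, GAZETTEER[phrase]))
                for k in range(start, start + n):
                    used[k] = True
    found.sort()  # report entities in the order they appear in the text
    return [(phrase, label) for _, phrase, label in found]

if __name__ == "__main__":
    # Display Result stage: print each entity with its class.
    for phrase, label in tag_entities("The Reserve Bank of India is based in New Delhi."):
        print(f"{phrase}: {label}")
```

Trying the longest N-grams first keeps "india" from being reported separately once "reserve bank of india" has already been matched.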
The main functions the product must perform or must let the user perform
1: User Self Service - User self-service is a subset of the knowledge management software category. It covers a range of software that specializes in the way information, process rules and logic are collected and accessed through support interviews. This software lets people get answers to their inquiries and/or needs through an automated interview instead of traditional search approaches.
2: Work Flow - A workflow consists of an orchestrated and repeatable pattern of business activity, enabled by the systematic organization of resources into processes that transform materials, provide services or process information. It can be depicted as a sequence of operations, declared as the work of a person or group, an organization of staff, or one or more simple or complex mechanisms.
3: Reporting and Diagrammatic Representation - With this approach to the articles in Communications, we better understand the culture, identity and evolution of computing. With a view toward portraying its value for institutional-identity data mining, we present several findings that emerged from our N-Gram analysis.
4: Extensibility - Extensibility is a software design principle defined as a system's ability to have new functionality added with minimal or no effect on its internal structure and data flow. In particular, recompiling or changing the original source code is unnecessary when the system's behavior is changed, whether by the creator or by other programmers.
5: Application Interface - An application interface specifies a component in terms of its operations, their inputs and outputs, and the underlying types. Its main purpose is to define a set of functionalities that are independent of their respective implementations, allowing both definition and implementation to vary without compromising each other.
Plan Of Action
1. Design U.I
2. Analysis
3. Implementation
4. Testing
5. Output Displayed
Summary of Research Paper
A new named entity class extraction method based on association rules has been presented and compared with the maximum entropy method. On the English corpus, with an appropriate combination of rule types, it is possible to improve the recall so that the association rule method is strictly more effective than the maximum entropy method. This result makes the method particularly suitable for tasks whose requirements emphasize the quality rather than the quantity of results.
Summary Cont.
A string match algorithm scans a given text for one or, more generally, all occurrences of a search string. This paper introduces a fast string match algorithm that detects exact and similar occurrences of a given pattern within an input string. In this paper, the sum of the character values of the search string is compared with the sum of the corresponding values in the sliding window. From the experimental results, the paper concludes that the novel algorithm is many times more efficient than BM, and that the longer the pattern, the greater the performance improvement.
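The sum-comparison idea can be illustrated with a short Python sketch. This is not the paper's algorithm, whose details are not given here; it only assumes that a window is worth comparing character by character when its character-code sum equals that of the pattern. The function name `sum_filter_search` is hypothetical.

```python
from typing import List

def sum_filter_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every exact occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    target = sum(ord(c) for c in pattern)    # sum of the pattern's character values
    window = sum(ord(c) for c in text[:m])   # sum for the first window of the text
    hits = []
    for i in range(n - m + 1):
        # Equal sums do not prove a match ("ab" and "ba" have the same sum),
        # so the window is verified character by character.
        if window == target and text[i:i + m] == pattern:
            hits.append(i)
        if i + m < n:
            # Slide the window: add the entering character, drop the leaving one.
            window += ord(text[i + m]) - ord(text[i])
    return hits

if __name__ == "__main__":
    print(sum_filter_search("abracadabra", "abra"))
```

Updating the sum incrementally means each shift costs two arithmetic operations, and the full comparison runs only for windows whose sum already agrees with the pattern.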
Algorithm
Exact String Match Algorithm
The exact string match algorithm, also called a string search algorithm, finds the places where one or several patterns (strings) occur within a larger string or text; that is, string matching consists of finding one or more occurrences of a pattern in a text. The strings considered are sequences of symbols, and the symbols are drawn from an alphabet. The size and other features of the alphabet are important factors in designing an algorithm.
Working of Algorithm
The text is scanned with the help of a window whose size is equal to m, the length of the pattern. First, the left ends of the window and the text are aligned, and then
the characters in the window are compared with the characters of the pattern; this comparison is generally called an attempt.
After a complete match or a mismatch of the pattern, the window is shifted to the right.
The whole procedure is repeated until the right end of the window goes beyond the right end of the text.
This is the sliding window mechanism, in which each attempt is associated with the position j in the text y at which the window is placed on y[j…j+m-1].
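Pseudo Code (T is the text of length n, P is the pattern of length m)
for i := 0 to n-m {
    for j := 0 to m-1 {
        if P[j] <> T[i+j] then break
    }
    if j = m then return i
}
This pseudocode shifts the window along the text one position at a time and, at each position, compares the corresponding characters of the pattern and the text.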
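The pseudocode above can be written as runnable Python roughly as follows. This is a minimal sketch: the function name `naive_search` is chosen here for illustration, and the outer loop stops at n - m so that the window never runs past the end of the text.

```python
def naive_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):          # each i is one attempt (window position)
        j = 0
        # Compare the pattern with the window text[i:i+m], left to right.
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:                      # every character matched
            return i
    return -1                           # the window ran past the end of the text

if __name__ == "__main__":
    print(naive_search("the price of money", "money"))
```

In the worst case this makes on the order of n × m character comparisons, which is why faster algorithms try to skip window positions.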
Tools Used for Experimentation
Visual Studio, SQL Server, .NET
Implementation
Using Visual Studio, SQL Server and .NET, organizations can give users the ability to find useful and interesting results in the last day's articles.
.NET will be used to create the front end and the application interface through which the user accesses the various functionalities. This ensures the best graphical layout and a much more user-friendly web page. We will create separate .NET pages for the modular functions. SQL Server will be used as the core back end, and the database is stored as a file on the system. Visual Studio will be used as the tool to compile Java programs. The algorithms, and the modifications to the pre-written VS toolkit code, will be written in .NET.
The application will ask the user to select a feature to perform an action, and the methods and algorithms will then generate results for the user.
Working of Program
Findings
After successful execution of the project, I found that it can be used to classify entities in free text, which makes the user's work easier. I also observed that the tool does not work properly with redundant data: for example, when we tried to classify money entities and matched against the string ‘money’, the tool was unable to display the correct output.
Conclusion
This report has looked in detail at the major techniques used for string matching in a given text. Section I gave an overview of named entity recognition and a basic introduction to the document. Section II described in detail the various string matching algorithms that are essential to the success of this project. Section III gave an overview of the functional requirements and diagrams, making it easy for the reader to understand the working of this project. Section IV focused on test planning and implementation tools. Thus a NER tool using N-grams is ready.
THANK YOU!!