Apache Lucene
Information extraction
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases this activity concerns processing human-language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, and video, can also be seen as information extraction. Due to the difficulty of the problem, current approaches to IE focus on narrowly restricted domains. An example is the extraction of corporate mergers from newswire reports, denoted by a formal relation such as MergerBetween(company1, company2, date), from an online news sentence such as: "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation to be done on the previously unstructured data. Beginning in 1987, IE was spurred by a series of Message Understanding Conferences.
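The example above can be sketched as a minimal pattern-based extractor. The pattern and the tuple shape are illustrative assumptions for this one sentence, not a general IE system; real systems use full NLP pipelines rather than a single regular expression.

```python
import re

# Illustrative pattern for "X announced their acquisition of Y" (an assumption
# for this one sentence shape, not a general-purpose extractor).
PATTERN = re.compile(
    r"(?P<acquirer>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
    r" announced (?:their|its) acquisition of "
    r"(?P<target>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
)

def extract_acquisitions(text):
    """Map unstructured text to structured (acquirer, target) tuples."""
    return [(m.group("acquirer"), m.group("target"))
            for m in PATTERN.finditer(text)]

sentence = ("Yesterday, New York based Foo Inc. announced "
            "their acquisition of Bar Corp.")
print(extract_acquisitions(sentence))  # → [('Foo Inc.', 'Bar Corp.')]
```

The structured tuple is the point: once the relation is extracted, the previously unstructured sentence supports computation (joins, counts, queries).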
More Data Beats Better Algorithms -- Or Does It? - Omar Tawakol - Voices
Anand Rajaraman of Walmart Labs wrote a great post four years ago on why more data usually beats better algorithms. He cited a competition modeled after the Netflix challenge, in which his Stanford data-mining students competed to produce better recommendations from a data set of 18,000 movies. The winning team used a very rudimentary algorithm but won because it appended data about the movies from outside the original data set (they used IMDb). By this line of thinking, Google proved the same lesson years ago when it showed that PageRank could outperform the keyword-extraction techniques used by other search engines at the time by leveraging data from outside the page itself: the votes page creators cast by choosing their outbound links, which define the network topology of the Web. Is this love of more data really well-founded?
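The PageRank idea mentioned above, ranking a page by the links pointing at it rather than by its own keywords, can be sketched with power iteration over a toy graph. The four-page link graph and the damping factor d=0.85 are illustrative assumptions, not from the source.

```python
# Toy link graph: each page lists its outbound links ("votes").
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

def pagerank(links, d=0.85, iters=50):
    """Power-iteration PageRank sketch; d is the usual damping factor."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs)  # split this page's vote evenly
            for q in outs:
                new[q] += d * share
        rank = new
    return rank

ranks = pagerank(links)
# Page "c" collects the most inbound votes, so it ranks highest,
# regardless of what keywords appear on the page itself.
```

The data doing the work here is external to each page: the network topology, exactly the "outside the page" signal the post credits for PageRank's advantage.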
Information Retrieval Research Group at Microsoft
The Adaptive Systems & Interaction group (ASI) pursues research on automated reasoning, adaptation, and human-computer interaction. Interests of the group include principles and applications of decision making and learning, computation in the face of complexity, techniques for information management and search, and the development and evaluation of innovative designs for visualization and interaction. Research goals include both the pursuit of basic science and the development of computing and communications applications that demonstrate new functionalities and flexibility. ASI is at the center of user modeling at Microsoft Research, focused on inferring the goals and needs of users from multiple sources of information about activity and interests. The group is also home to research on information retrieval and management, including work in automated text classification and clustering. Areas of focus: user modeling and intelligent user interfaces; HCI design and visualization.
API
In computer programming, an application programming interface (API) specifies how software components should interact with each other. In most procedural languages, an API specifies a set of functions or routines that accomplish a specific task or are allowed to interact with a specific software component. The Unix command man 3 sqrt presents the signature of the function sqrt in the form:

    SYNOPSIS
        #include <math.h>

        double sqrt(double X);
        float sqrtf(float X);

    DESCRIPTION
        sqrt computes the positive square root of the argument. ...

This description means that the sqrt() function returns the square root of a non-negative floating-point number (single or double precision) as another floating-point number. Hence the API in this case can be interpreted as the collection of the include files used by a C program to reference that library function, together with the human-readable description provided by the man pages.
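Because the API is a contract rather than a particular piece of code, a caller in another language can program against the same signature. A sketch using Python's ctypes to call the libm sqrt described above; the fallback library name "libm.so.6" is a Linux-specific assumption.

```python
import ctypes
import ctypes.util

# Locate the C math library; the exact file name is platform-dependent,
# and "libm.so.6" is a Linux assumption used as a fallback.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path or "libm.so.6")

# Declare the signature exactly as the man page states it:
#     double sqrt(double X);
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # → 1.4142135623730951
```

The argtypes/restype declarations are the Python-side restatement of the C prototype: the binding works only because the man page pins down the types on both sides of the call.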
NLP Systems & Applications: Knowledge Base Population — Ling573, Spring Qtr. 2010
Course description: This course examines building coherent systems to handle practical applications. Particular topics vary; this term we will be focusing on question answering.
Textbook: There is no required textbook for this course. A number of published research articles will also provide background for the course.
Prerequisites: Ling 570, Ling 571, Ling 572; CS 326 (Data Structures) or equivalent; Stat 391 (Probability and Statistics for CS) or equivalent; formal grammars, languages, and automata; programming in one or more of Java, Python, C/C++, or Perl; Linux/Unix commands.
Grading: 90% project deliverables and presentations; 10% class participation.
Course mechanics: Additional detailed information on grading, collaboration, incompletes, etc. Tentative schedule, subject to change without notice.
Environmental Energy Technologies Division
Keeping Found Things Found [KFTF]
Health Care
Health care (or healthcare) is the diagnosis, treatment, and prevention of disease, illness, injury, and other physical and mental impairments in humans. Health care is delivered by practitioners in allied health, dentistry, midwifery and obstetrics, medicine, nursing, optometry, pharmacy, and other care professions. It refers to the work done in providing primary, secondary, and tertiary care, as well as in public health. Access to health care varies across countries, groups, and individuals, largely influenced by social and economic conditions as well as the health policies in place. Countries and jurisdictions have different policies and plans in relation to the personal and population-based health care goals within their societies. Health care systems are organizations established to meet the health needs of target populations. Health care can contribute a significant part of a country's economy.