Being a Yahoo!
We couldn't think of a better way to describe what it means to be a Yahoo than
to have some of us tell you what we've been up to lately. So we asked
Geir and Kristian to log their lives for a week.
27.08.2007
Wrote a system test for parallel query evaluation in Ruby. The purpose of the test is to verify that different queries give the same result with both term-wise and parallel evaluation. I generated some synthetic documents and defined several different queries (term, phrase, and, or, andnot, and rank) with corresponding expected result sets. Rank score is ignored for now, as ranking during parallel evaluation is not yet implemented. The test was executed in the Ruby test framework, and some of the Vespa programs were run under valgrind. This uncovered an invalid memory read in the BitVector class, a defect that was fixed later on.
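To illustrate the idea behind such a test, here is a minimal Python sketch (not the actual Ruby system test). It assumes a toy in-memory index and interprets "term-wise" as building the hit set of each sub-query before combining them, and "parallel" as checking the whole query tree against one document at a time. The documents, the tuple-based query format, and all function names are hypothetical.

```python
# Hypothetical synthetic documents: document id -> set of terms.
DOCS = {
    1: {"vespa", "search"},
    2: {"vespa", "rank"},
    3: {"search", "rank"},
    4: {"vespa", "search", "rank"},
}


def evaluate_term_wise(query):
    """Build the full hit set of each sub-query, then combine the sets."""
    op, *args = query
    if op == "term":
        return {doc_id for doc_id, terms in DOCS.items() if args[0] in terms}
    sets = [evaluate_term_wise(arg) for arg in args]
    if op == "and":
        return set.intersection(*sets)
    if op == "or":
        return set.union(*sets)
    if op == "andnot":
        return sets[0].difference(*sets[1:])
    raise ValueError(f"unknown operator: {op}")


def matches(query, terms):
    """Check the whole query tree against the terms of a single document."""
    op, *args = query
    if op == "term":
        return args[0] in terms
    if op == "and":
        return all(matches(arg, terms) for arg in args)
    if op == "or":
        return any(matches(arg, terms) for arg in args)
    if op == "andnot":
        return matches(args[0], terms) and not any(
            matches(arg, terms) for arg in args[1:]
        )
    raise ValueError(f"unknown operator: {op}")


def evaluate_parallel(query):
    """Visit each document once and evaluate all terms together."""
    return {doc_id for doc_id, terms in DOCS.items() if matches(query, terms)}


# Each query is paired with its expected result set (rank scores are ignored).
QUERIES = [
    (("term", "vespa"), {1, 2, 4}),
    (("and", ("term", "vespa"), ("term", "search")), {1, 4}),
    (("or", ("term", "search"), ("term", "rank")), {1, 2, 3, 4}),
    (("andnot", ("term", "vespa"), ("term", "rank")), {1}),
]


def run_equivalence_test():
    """Assert that both evaluation modes return the expected hits."""
    for query, expected in QUERIES:
        assert evaluate_term_wise(query) == expected, query
        assert evaluate_parallel(query) == expected, query
    return len(QUERIES)
```

Calling `run_equivalence_test()` checks every query against both evaluation paths and returns the number of queries that were verified.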
28.08.2007
Started looking at a ticket regarding a problem with bolding on special tokens. A new Vespa build was installed on a test node and the problem was recreated. The C++ code for dynamic teaser creation was inspected and I activated some debugging before building a new fsearch binary. The cause of the problem was discovered after running the same query with the new fsearch binary. A new feature request ticket was created, so the problem will be fixed at a later time.
In order to enable auto-completion when using an FSA (finite state automaton), an
iterator traversing all valid states from a given state is needed. I started
designing a simple iterator with a focus on speed and low memory consumption.
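The following Python sketch shows one way such an iterator could work; it is an illustration under stated assumptions, not the design described above. It assumes a trie-shaped automaton without state sharing, and the `State` class and the `build_fsa`, `walk`, and `completions` functions are hypothetical names. The traversal is a lazy depth-first walk with an explicit stack, so only the current frontier is kept in memory.

```python
class State:
    """Hypothetical FSA state: transitions keyed by symbol, plus a final flag."""

    def __init__(self):
        self.transitions = {}
        self.final = False


def build_fsa(words):
    """Build a trie-shaped automaton for illustration (no state sharing)."""
    start = State()
    for word in words:
        state = start
        for symbol in word:
            state = state.transitions.setdefault(symbol, State())
        state.final = True
    return start


def walk(start, prefix):
    """Follow `prefix` from `start`; return the reached state or None."""
    state = start
    for symbol in prefix:
        state = state.transitions.get(symbol)
        if state is None:
            return None
    return state


def completions(state):
    """Lazily yield every suffix leading from `state` to a final state."""
    stack = [(state, "")]
    while stack:
        current, suffix = stack.pop()
        if current.final:
            yield suffix
        # Push symbols in reverse order so suffixes are produced in sorted order.
        for symbol in sorted(current.transitions, reverse=True):
            stack.append((current.transitions[symbol], suffix + symbol))
```

For auto-completion, a caller would first `walk` to the state reached by the typed prefix and then consume `completions` from that state, stopping as soon as enough suggestions have been produced.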
29.08.2007
We had a demo of the new parallel query evaluation for the Vespa architects. Some queries from the system test were shown and we explained what we had done this sprint in order to implement the new functionality. We also briefly discussed the overall plans for the next sprint.
Later on the following tech-talks were presented by people in the office:
"Flickr architecture and Vespa performance guidelines", "The new ranking model",
"The new ranking features".
Based on the iterator design from yesterday, I implemented the FSA iterator in
Java and wrote some unit tests using JUnit. I usually use IntelliJ the few
times I do Java programming, but this time I used my favorite, Vim.
30.08.2007
Fixed some things in the FSA iterator and extended the unit tests before committing the code.
Revisited some of my tickets related to partial updates and closed the ones that
were fixed. Started looking at some low priority tickets also related to partial
updates.
31.08.2007
Continued looking at the low priority tickets and fixed most of them. We decided to deprecate attribute:unsigned in the search definition, so I made the necessary corrections to the Java code in the searchdefinition module and updated the Vespa documentation.
Earlier I optimized range search and prefix search on attributes using posting
lists, but in order to have a stable worst case search time, a fallback to
filtering is necessary when the number of hits gets large. The time complexity
of the two approaches can be expressed in terms of the number of hits and the
number of values to search through. I ran a benchmark program for attributes in
order to determine the constants. Based on the time complexity expressions and the
calculated constants, a strategy with fallback to filtering was implemented in C++
for both range search and prefix search. The approximate cost of searching with
posting lists and with filtering is calculated, and the cheaper strategy is
chosen.
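A cost-based choice of this kind can be sketched in Python as follows. This is only an illustration: the actual implementation is in C++, and neither the cost expressions nor the benchmarked constants are given here. The sketch assumes both costs are linear, with the posting list cost proportional to the estimated number of hits and the filtering cost proportional to the number of values; the constants and function names are hypothetical.

```python
# Hypothetical constants; the real values came from a benchmark and are not given.
COST_PER_HIT = 4.0    # assumed cost of handling one hit via posting lists
COST_PER_VALUE = 1.0  # assumed cost of checking one attribute value when filtering


def posting_list_cost(estimated_hits):
    """Assumed linear cost model: grows with the number of hits."""
    return COST_PER_HIT * estimated_hits


def filtering_cost(num_values):
    """Assumed linear cost model: grows with the number of values scanned."""
    return COST_PER_VALUE * num_values


def choose_strategy(estimated_hits, num_values):
    """Pick the strategy with the lower approximate cost."""
    if posting_list_cost(estimated_hits) <= filtering_cost(num_values):
        return "posting_lists"
    return "filtering"
```

With these assumed constants, a selective query such as `choose_strategy(100, 1_000_000)` picks posting lists, while `choose_strategy(900_000, 1_000_000)` falls back to filtering, which keeps the worst case bounded by the cost of one scan over the values.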
Notes from the 2007 Tech Expo
13.09.2007: General
12.09.2007: General
17:00 A4 Tortola - VDS and Elmo Studio
11.09.2007: General
14:00 Classroom 4 Building C SNV - Vespa Document Storage (VDS): Demo and Use Cases
17:00 - Meet Yi-Kai
10.09.2007: General
14:00 - Mail Search capacity planning
07.09.2007: General
14:00 Tex / Takuya / Grace
17:00 A3 Death Valley - Meeting with Jason / Grace
06.09.2007: 11:00 Worked with Quoc
14:00 The Witch - VDS and Studio
16:00 Met Grace and Will
05.09.2007: 10:00 The Witch - Meet Eric to plan the October activities
12:30 Santa Clara GA - Lunch with Lev Stesin
13:00 Met Will (Mail Search Ops)
16:00 Met Grace to continue the Mail Search integration planning
04.09.2007: General
13:00 2MC3 Mavericks - Biweekly Vespa engineering discussion
14:00 Met Dheeren in SE team
16:00 - Met Grace for Mail Search discussions
03.09.2007: General
Yahoo! is committed to equal opportunity. In that spirit, we welcome your interest in our employment opportunities.
Copyright © 2007 Yahoo! Inc. All rights reserved.