Work @Yahoo!
Being a Yahoo!
We couldn't think of a better way to describe what it means to be a Yahoo than to have some of us tell you what we've been up to lately. So we asked Geir and Kristian to log their lives for a week.

A Software Engineer's Week
Name:Geir
Position:Software Engineer
Group:Search Core Team
Joined Yahoo!:August 2006
27.08.2007
Wrote a system test for parallel query evaluation in Ruby. The purpose of the test is to verify that different queries give the same result with both term-wise and parallel evaluation. I generated some synthetic documents and defined several different queries (term, phrase, and, or, andnot, and rank) with corresponding expected result sets. Rank score is ignored for now, as ranking during parallel evaluation is not yet implemented. The test was executed in the Ruby test framework, and some of the Vespa programs were run under valgrind. An invalid memory read was discovered in the BitVector class, a defect that was fixed later on.
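The idea behind such a consistency test can be sketched in a few lines of Python. This is a toy illustration, not the actual Vespa system test (which was written in Ruby): the corpus, the query tuples and all function names are hypothetical, and it assumes that "parallel" evaluation means checking all query terms together one document at a time, while "term-wise" evaluation combines whole posting sets operator by operator. Phrase and rank queries are left out.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A query is ("term", word) or (op, left, right) with op in {"and", "or", "andnot"}.
Query = Tuple

# Synthetic documents (hypothetical data): document id -> text.
DOCS: Dict[int, str] = {
    1: "red apple pie",
    2: "green apple",
    3: "red car",
    4: "green pie",
}


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Build a tiny inverted index: term -> set of document ids."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in docs.items():
        for term in text.split():
            index.setdefault(term, set()).add(doc_id)
    return index


def eval_term_wise(query: Query, index: Dict[str, Set[int]]) -> Set[int]:
    """Term-wise: combine complete posting sets, one operator at a time."""
    op = query[0]
    if op == "term":
        return set(index.get(query[1], set()))
    left = eval_term_wise(query[1], index)
    right = eval_term_wise(query[2], index)
    if op == "and":
        return left & right
    if op == "or":
        return left | right
    if op == "andnot":
        return left - right
    raise ValueError(f"unknown operator: {op}")


def matches(query: Query, doc_id: int, index: Dict[str, Set[int]]) -> bool:
    """Check a single document against the whole query tree."""
    op = query[0]
    if op == "term":
        return doc_id in index.get(query[1], set())
    left = matches(query[1], doc_id, index)
    right = matches(query[2], doc_id, index)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "andnot":
        return left and not right
    raise ValueError(f"unknown operator: {op}")


def eval_parallel(query: Query, index: Dict[str, Set[int]],
                  doc_ids: Iterable[int]) -> Set[int]:
    """Assumed meaning of 'parallel': all terms evaluated together per document."""
    return {d for d in doc_ids if matches(query, d, index)}


def failing_queries(cases: List[Tuple[Query, Set[int]]],
                    docs: Dict[int, str]) -> List[Query]:
    """Return the queries where either strategy misses the expected result set."""
    index = build_index(docs)
    failures = []
    for query, expected in cases:
        if (eval_term_wise(query, index) != expected
                or eval_parallel(query, index, docs) != expected):
            failures.append(query)
    return failures


CASES = [
    (("term", "apple"), {1, 2}),
    (("and", ("term", "red"), ("term", "apple")), {1}),
    (("or", ("term", "car"), ("term", "pie")), {1, 3, 4}),
    (("andnot", ("term", "green"), ("term", "pie")), {2}),
]
```

Calling `failing_queries(CASES, DOCS)` returns an empty list when both strategies agree with the expected result sets; any query it returns points at a divergence worth investigating.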
28.08.2007
Started looking at a ticket regarding a problem with bolding on special tokens. A new Vespa build was installed on a test node and the problem was recreated. The C++ code for dynamic teaser creation was inspected and I activated some debugging before building a new fsearch binary. The cause of the problem was discovered after running the same query with the new fsearch binary. A new feature request ticket was created, so the problem will be fixed at a later time.
To enable auto-completion when using an fsa (finite state automaton), an iterator that traverses all valid states reachable from a given state is needed. I started designing a simple iterator with a focus on speed and low memory consumption.
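A minimal Python sketch of such an iterator follows. It is an assumption-laden illustration rather than the Vespa design: the automaton is a plain, unminimized trie, and the names `State`, `build_automaton`, `walk` and `iter_completions` are hypothetical. The traversal uses an explicit stack and a generator, so completions are produced lazily without recursion and without building the full result list first.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


class State:
    """One automaton state: outgoing transitions plus a 'final' (accepting) flag."""
    __slots__ = ("transitions", "final")

    def __init__(self) -> None:
        self.transitions: Dict[str, "State"] = {}
        self.final = False


def build_automaton(words: Iterable[str]) -> State:
    """Build a trie-shaped acyclic automaton accepting exactly the given words."""
    start = State()
    for word in words:
        state = start
        for ch in word:
            state = state.transitions.setdefault(ch, State())
        state.final = True
    return start


def walk(start: State, prefix: str) -> Optional[State]:
    """Follow the prefix from the start state; None if the prefix is not valid."""
    state: Optional[State] = start
    for ch in prefix:
        state = state.transitions.get(ch)
        if state is None:
            return None
    return state


def iter_completions(state: State) -> Iterator[str]:
    """Lazily yield the suffix leading to every final state reachable from `state`.

    Depth-first with an explicit stack; children are pushed in reverse
    order so that suffixes come out in lexicographic order.
    """
    stack: List[Tuple[State, str]] = [(state, "")]
    while stack:
        current, suffix = stack.pop()
        if current.final:
            yield suffix
        for ch in sorted(current.transitions, reverse=True):
            stack.append((current.transitions[ch], suffix + ch))
```

For auto-completion, one would first call `walk` with the text typed so far and then prepend that prefix to each suffix from `iter_completions`; a caller that needs only the first few suggestions can stop iterating early.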
29.08.2007
We had a demo of the new parallel query evaluation for the Vespa architects. Some queries from the system test were shown and we explained what we had done this sprint in order to implement the new functionality. We also briefly discussed the overall plans for the next sprint.
Later on the following tech-talks were presented by people in the office: "Flickr architecture and Vespa performance guidelines", "The new ranking model", "The new ranking features".
Based on the iterator design from yesterday I implemented the fsa iterator in Java, and wrote some unit tests using JUnit. I usually use IntelliJ the few times I do Java programming, but this time I used my favorite, Vim.
30.08.2007
Fixed some things on the iterator for fsa and extended the unit tests before committing the code.
Revisited some of my tickets related to partial updates and closed the ones that were fixed. Started looking at some low priority tickets also related to partial updates.
31.08.2007
Continued looking at the low priority tickets and fixed most of them. We decided to deprecate attribute:unsigned in search definition, so I made the necessary corrections to the Java code in the searchdefinition module and updated the Vespa documentation.
Earlier I optimized range search and prefix search on attributes using posting lists, but in order to have a stable worst case search time a fallback to filtering is necessary when the number of hits gets large. The time complexity of the two approaches can be expressed in terms of the number of hits and the number of values to search through. I ran a benchmark program for attributes in order to determine the constants. Based on the time complexity expressions and the calculated constants, a strategy with fallback to filtering was implemented in C++ for both range search and prefix search. The approximate cost of the search is calculated for both posting lists and filtering, and the cheaper strategy is chosen.
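The selection logic can be illustrated with a small Python sketch. It is not the C++ implementation: it assumes both costs are linear (posting-list cost proportional to the number of hits, filtering cost proportional to the number of values scanned), and the class name, method names and constants are hypothetical placeholders for the benchmarked values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Linear cost estimates; both constants would come from a benchmark."""
    posting_cost_per_hit: float   # hypothetical constant
    filter_cost_per_value: float  # hypothetical constant

    def posting_cost(self, estimated_hits: int) -> float:
        # Assumed: posting-list work grows with the number of hits.
        return self.posting_cost_per_hit * estimated_hits

    def filter_cost(self, num_values: int) -> float:
        # Assumed: filtering scans every value once, regardless of hits.
        return self.filter_cost_per_value * num_values

    def choose(self, estimated_hits: int, num_values: int) -> str:
        """Pick the strategy with the lower estimated cost."""
        if self.posting_cost(estimated_hits) <= self.filter_cost(num_values):
            return "posting_lists"
        return "filter"


# Made-up constants: one posting-list hit costs as much as four filtered values.
MODEL = CostModel(posting_cost_per_hit=4.0, filter_cost_per_value=1.0)
```

With these made-up constants and 1000 values, `MODEL.choose(100, 1000)` selects posting lists, while `MODEL.choose(500, 1000)` falls back to filtering. Because the filtering cost does not depend on the number of hits, it bounds the worst case search time.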

An Engineering Manager On The Road
Name:Kristian
Position:Engineering Manager
Group:Storage
Joined Yahoo!:Dawn of time (i.e. Overture acquisition in 2003)

Notes from the 2007 Tech Expo


13.09.2007:
General
  • last day, left at noon for the Airport, uneventful flight back home (although a new record, 13.5 hrs SFO-TRD is not bad!)
  • Got a lot of swag for Karrieredagen from Fran - one overweight bag is $110 at KLM ...
  • Met with Jaspal, Mail architect, brief on the homestore project and Vespa team's plans for Mail going forward (focus on Mail Search)
  • Met with David, talked about Music's new service and use of VDS and Vespa-Search
  • Met with Patrick, next steps for Image Search, API integration when Håkon and Einar are over in October


12.09.2007:
General
  • Attended most of the Vespa sessions, and YDHT (interesting)
  • Met with Grace to discuss next steps for Mail Search
    • migration tool: Kristian, next sprint proposal. Phase one: evaluate Vespa sizing (Kristian). Phase two: do more tests if needed
    • compression: Kristian to check with Henning
    • integration: Kristian to send new proposal, then make travel plans, checksum proposal
    • slow searches for large users: Grace
    • dinner with Grace, Jason, Nadia, Yngve and Henning at Nola's in Palo Alto afterwards (Jason is VP, interesting to get his opinions on Mail Search next steps)

17:00 A4 Tortola - VDS and Elmo Studio


11.09.2007:
General
  • Attended the Vespa-C and MyBlogLog sessions
  • prepared for my presentation, and many follow-ups after it (questions etc)

14:00 Classroom 4 Building C SNV - Vespa Document Storage (VDS): Demo and Use Cases

17:00 - Meet Yi-Kai
  • Taiwan team needs a way to set up something that can feed from storage to search (i.e. subscriptions)


10.09.2007:
General
  • Attended most of the Vespa sessions

14:00 - Mail Search capacity planning


07.09.2007:
General
  • headed to San Francisco for the weekend, drove after 19:00 to avoid traffic
  • worked on the mail search systems over the weekend to fix the 8M users problem on vsm nodes

14:00 Tex / Takuya / Grace
  • We discussed the Japanese linguistics requirements
  • Grace/Mail team to decide priorities

17:00 A3 Death Valley - Meeting with Jason / Grace
  • See Grace's notes in a separate mail
  • we can continue the 3.x work but must also address some short term issues


06.09.2007:
11:00 Worked with Quoc
  • on VDS install / 64 bit packages for Vespa-C
  • factory introduction
  • lunch at PhoNam - authentic Vietnamese food!

14:00 The Witch - VDS and Studio
  • Xuejun / Studio team. Studio team would like to learn from Kristian about:
  • what's the best way to use VDS
  • how to do bcp/fail over with the current VDS
  • future plans and requirements for VDS

16:00 Met Grace and Will
  • Mail search prod issues
  • prepare for meeting with Jason, which was moved to tomorrow


05.09.2007:
10:00 The Witch - Meet Eric to plan the October activities

12:30 Santa Clara GA - Lunch with Lev Stesin
  • Maps might want to try VDS for smaller installations (use ymdb for US today)

13:00 Met Will (Mail Search Ops)
  • various Mail Search production issues
  • intro to new stuff

16:00 Met Grace to continue the Mail Search integration planning
  • discussed trends / growth in clusters
  • looked into why some queries are slow in MUD1 - possibly large user files


04.09.2007:
General
  • I met Balu - he has been involved with Vertex, and can be our 'contact' person in Vespa-C team on Vertex. We did not plan follow-up meetings.
  • I met a lot of new people (Vespa-C, Qualis, Vespa-SE, Vespa-Studio) and set up meetings later this week on various subjects
  • Dinner with Grace at Satsuma, Mountain View

13:00 2MC3 Mavericks - Biweekly Vespa engineering discussion
  • Bi-weekly engineering meeting

14:00 Met Dheeren in SE team
  • Discussed Vespa Systems Engineering
  • Dheeren is managing the SE team in SNV
  • We discussed the Vespa components on a higher level (how they work), and the importance of the SE team being Vespa experts

16:00 - Met Grace for Mail Search discussions
  • Discussed production problems / what is the expected data size in RE22
  • Discussed Mail integration with Vespa 3.x
    • Mail has a plan to move to Linux
    • There are alternative ways to do the delivery/vespagrim process, so we can overcome the DocAPI linking problem - to be explored in more detail


03.09.2007:
General

Yahoo! is committed to equal opportunity. In that spirit, we welcome your interest in our employment opportunities.

Copyright © 2007 Yahoo! Inc. All rights reserved.