Think Tank
In addition to the exciting, longer-term Key Scientific Challenges that Yahoo! Labs is tackling, we’re also presented with interesting technical and business challenges that our core engineering and product teams address on a daily basis. These are complex challenges that only the largest Internet technology companies in the world face, from building, operating and improving massive datacenters, to making user-generated publishing networks possible at scale, and everything in between.
We are constantly innovating in each of these areas, and we believe that collaboration with universities is a very important part of that process. We also understand that at times it can be difficult for academia to stay in touch with, and on top of, the fast-paced, ever-changing needs of industry. That’s why we are sharing this list of technical and business problems, roadblocks and future challenges, in an effort to give the next generation of Internet engineers, designers and entrepreneurs access to real-world, large-scale problems and information.
This page is an open call to students, faculty and others in the academic community to join us in thinking about and solving some of the most exciting challenges facing our industry today. This is a new program designed to complement our KSC Program and offer faculty and students the unique opportunity to collaborate with our engineers and business leaders in solving these problems. The collaboration could take the form of an internship, an externship, a business plan, a case study, a course project, or even a formally funded research project. If you are interested in working on these problems in any capacity, please send an email with your proposal to the alias listed below the relevant challenge area.
The bottom line is that we’re excited to hear your ideas and find interesting ways to tackle these problems together. We hope you’ll join us in thinking about these challenges, as the solutions have the potential to impact the future of the Internet for billions of users around the world.
Challenges
Accessibility
- Improving the web for people with cognitive and learning disabilities: With the introduction of Web 2.0 and more dynamic web sites, people with cognitive and learning disabilities face ever-increasing complexity of content, crowded layouts and confusing interactions. Cryptically written content (news headlines, articles, blog entries) and inconsistently designed widgets (drop-down menus, slideshows) are just some of the challenges that these users face when browsing the Internet of today.
- How can we evaluate articles at feed ingestion time and create a simplified version for those with cognitive disabilities? This could include a specialized summary, a list of included links at the footer, replacement of difficult wording with simpler descriptions, etc.
- How can a web site’s content be dynamically transformed (simplified or enhanced) depending on the user’s cognitive level?
- Determining a user’s disability via machine learning: While many in the Internet industry strive for an inclusive web and design around web standards, we realize that it is nearly impossible to design a web site that is universally accessible to everyone. At some point, certain features for one type of user (e.g. one with a learning disability) will conflict with features for another (e.g. one who uses screen reading software). It is therefore important to give users the ability to set their preferences based on how they interact with the computer. Alternatively, the web site could be designed to change dynamically in response to the user’s behavior (e.g. shaky mouse movements, keyboard presses, etc.).
- How can an intelligent algorithm detect the type of user visiting the web site?
- How much data is needed to draw conclusions about a user’s behavior before dynamically transforming the web site in response to their actions?
- Augmented reality to help blind users learn about their surroundings: Mobile phones and portable GPS devices have drastically changed the way blind people travel and learn about their surroundings on the go. Despite all the technological advancements, however, there are still a number of problems preventing blind users from being able to rely completely on technology, particularly when navigating within narrow or constrained spaces.
One area that has been gaining ground recently is augmented reality, which helps the user visualize physically inaccessible objects. For example, a combination of maps and photographs is used to give the user an idea of the place they are approaching.
- Can augmented reality be extended to include sound, haptic feedback and/or other human senses besides vision to provide a similar experience for blind people?
- Are there limits to a multi-sense approach, beyond which the “reality” becomes overwhelming or impossible to comprehend?
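One of the ideas listed above, replacing difficult wording with simpler descriptions and gathering an article's links into a footer list, can be illustrated with a small Python sketch. This is only a starting point under stated assumptions: the glossary contents and the names `GLOSSARY`, `simplify_wording`, `LinkCollector` and `collect_links` are hypothetical, and a fixed word list is a stand-in for whatever language model or readability analysis a real proposal would use.

```python
import re
from html.parser import HTMLParser

# Hypothetical glossary: lowercase difficult words mapped to simpler ones.
GLOSSARY = {
    "utilize": "use",
    "commence": "start",
    "approximately": "about",
}


def simplify_wording(text, glossary=GLOSSARY):
    """Replace whole words found in `glossary`, ignoring case."""
    if not glossary:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in glossary) + r")\b",
        re.IGNORECASE,
    )

    def swap(match):
        original = match.group(0)
        simple = glossary[original.lower()]
        # Keep a leading capital so sentence starts still read correctly.
        return simple.capitalize() if original[0].isupper() else simple

    return pattern.sub(swap, text)


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(html):
    """Return every link target in `html`, e.g. to list in a footer."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Word-for-word substitution ignores context and grammar, so it leaves open the harder questions posed above: how to judge which wording is difficult for a given reader, and how to produce a summary rather than a patched copy.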
Interested in working on these challenges? Please send an email with your information and idea to: accessibility@yahoo-inc.com.
Performance Tools
- Page grouping based on structural and semantic analysis: To perform an apples-to-apples comparison of page load time and other metrics across different webpages, we must be able to assign a webpage to a meaningful category: front page, index, photo gallery, movie, article, search results... We need a system that, when given any URL, will analyze the page and return a predefined page type. Pages of the same type often have radically different content; for example, both yahoo.com and google.com would be considered front pages.
- Predicting web page performance before going live: A common desire is to know how well a page will perform for all users around the world before it goes live. Correlate the different metrics in the Navigation Timing API to page structure and content, performance best practices, server setup, and geographic location. With this data, build a system that can analyze a page and predict the distributions of all the performance metrics if the page were to go live. Allow for "what if" scenarios that would predict the performance-related impact of a potential code change.
- Human perceived webpage performance: Perform research to determine how page performance is perceived. What makes a page intolerably slow? Is there a time limit by which all visible content must be loaded? Does perceived performance improve if content is rendered one element at a time or all at once? At what point does decreasing page load time no longer matter to users (500ms, 300ms, 50ms...)? How do the quality and amount of features and content alter perceived page load time? Are users willing to visit slower yet richer sites?
- Automatic thresholding for alerting on abnormal metrics: Research methods for modeling metric data in order to detect abnormalities. Currently, two methods are used to set thresholds on metrics data: fixed static thresholds and week-over-week variance. What models can be built to detect, with good confidence, when metrics indicate that a service is in a critical state?
A large volume of metrics data flows in continuously in real time and would require analysis. If alerts are invalid or "noisy", responders will lose trust in the system, possibly leading to a lack of action on valid alerts.
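To make the thresholding question more concrete, the Python sketch below contrasts a fixed static threshold with one simple adaptive alternative based on a rolling median and median absolute deviation (MAD). It is an illustration under assumptions, not a description of any production system: the function names, the window size, the multiplier `k` and the `min_spread` floor are all hypothetical, and the rolling-median rule is a common robust-statistics baseline rather than either of the two methods mentioned above.

```python
from statistics import median


def static_alerts(values, limit):
    """Fixed static threshold: flag every index whose value exceeds `limit`."""
    return [i for i, v in enumerate(values) if v > limit]


def mad_alerts(values, window=12, k=5.0, min_spread=1.0):
    """Adaptive threshold: flag points far from the rolling median.

    A point is flagged when it deviates from the median of the previous
    `window` points by more than `k` times their median absolute deviation.
    All defaults are illustrative, not tuned values.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = median(history)
        mad = median([abs(v - center) for v in history])
        # Floor the spread so a nearly flat history does not turn
        # tiny fluctuations into alerts.
        spread = max(mad, min_spread)
        if abs(values[i] - center) > k * spread:
            alerts.append(i)
    return alerts


# Hypothetical latency samples with one spike at index 12.
samples = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 250, 101]
print(static_alerts(samples, 102))  # [6, 12]
print(mad_alerts(samples))          # [12]
```

In this toy series the static limit of 102 also flags an ordinary fluctuation, while the median-based rule flags only the spike; because the median and MAD are resistant to outliers, the spike itself does not distort the baseline used for the next point. Seasonality, trend and the choice of window remain open, which is where the research questions above begin.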
Interested in working on these challenges? Please send an email with your information and idea to: performancetools@yahoo-inc.com.
