Design TinyUrl

  1. counter计数器
    • pros: 简单
    • cons:
      • 允许范围为int的范围。
      • url不一定更短。
      • easy to predict
  2. 在计数器的基础上,引入字符编码。即根据counter来决定用什么字符。每个62个counter多一个字符
    • cons:
      • integer overflow
      • 不一定更短
      • easy to predict
  3. 利用hashCode()方法,
    • 数量仍然受到int的限制,因为hashCode是integer calculation
    • length仍然不确定
    • hashCode不能保证唯一性,可能存在hash collision,同时随着数量的增大,collision的概率也越来越大了。并且在int的limit到来之前就有可能发生collision。
  4. random number直接作为"counter"
    • 仍然收到int限制
    • 长度不定
    • 随着数量的增大,collision的概率也更大了
  5. random number + fixed length string
    • 允许数量变多62^6
    • 无法预测,可以灵活的增加
    • 还可以用空间换时间,增加一个HashMap更快

https://segmentfault.com/a/1190000006140476

  1. How many unique identifiers possible? Will you run out of unique URLs? (10+26+26)^6

  2. Should the identifier be increment or not? Which is easier to design? Pros and cons? Yes

    • pros: more urls. hard to be decoded
    • cons: not tiny anymore, could be very long
  3. Mapping an identifier to an URL and its reversal - Does this problem ring a bell to you?

    • LC 535
  4. How do you store the URLs? Does a simple flat file database work?

  5. What is the bottleneck of the system? Is it read-heavy or write-heavy?

  6. Estimate the maximum number of URLs a single machine can store.

  7. Estimate the maximum number of queries per second (QPS) for decoding a shortened URL in a single machine.

  8. How would you scale the service? For example, a viral link which is shared in social media could result in a peak QPS at a moment's notice.

  9. How could you handle redundancy? i,e, if a server is down, how could you ensure the service is still operational?

  10. Keep URLs forever or prune, pros/cons? How we do pruning? (Contributed by @alex_svetkin)

  11. What API would you provide to a third-party developer? (Contributed by @alex_svetkin)

  12. If you can enable caching, what would you cache and what's the expiry time? (Contributed by @Humandroid)

results matching ""

    No results matching ""