k8s-Building Large Clusters
Published: 2019-05-10


Reposted from the official documentation: https://kubernetes.io/docs/admin/cluster-large/

Support

At v1.5.1, Kubernetes supports clusters with up to 2000 nodes. More specifically, we support configurations that meet all of the following criteria:

  • No more than 2000 nodes
  • No more than 60000 total pods
  • No more than 120000 total containers
  • No more than 100 pods per node


Setup

A cluster is a set of nodes (physical or virtual machines) running Kubernetes agents, managed by a “master” (the cluster-level control plane).

Normally the number of nodes in a cluster is controlled by the value NUM_NODES in the platform-specific config-default.sh file (for example, the GCE config-default.sh).

Simply changing that value to something very large, however, may cause the setup script to fail for many cloud providers. A GCE deployment, for example, will run into quota issues and fail to bring the cluster up.
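
For instance, a minimal sketch of overriding the node count for a kube-up.sh based deployment (the value shown is illustrative):

# Override the node count without editing config-default.sh in place.
export NUM_NODES=500   # illustrative value
./cluster/kube-up.sh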

When setting up a large Kubernetes cluster, the following issues must be considered.

Quota Issues

To avoid running into cloud provider quota issues, when creating a cluster with many nodes, consider:

  • Increase the quota for things like CPU, IPs, etc.
    • On GCE, for example, you'll want to increase the quota for:
      • CPUs
      • VM instances
      • Total persistent disk reserved
      • In-use IP addresses
      • Firewall Rules
      • Forwarding rules
      • Routes
      • Target pools
  • Gate the setup script so that it brings up new node VMs in smaller batches with waits in between, because some cloud providers rate limit the creation of VMs (a batching sketch follows this list).
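
As a rough illustration of the batching approach, here is a hedged shell sketch; create_node_batch is a hypothetical placeholder for the provider-specific VM-creation call, and the batch size and wait are illustrative:

# Hypothetical batching helper: replace the body with the provider-specific
# VM-creation call (e.g. a gcloud or aws invocation).
create_node_batch() {
  echo "creating nodes $1 through $(( $1 + $2 - 1 ))"
}

BATCH_SIZE=50
NUM_NODES=${NUM_NODES:-200}
for (( i = 0; i < NUM_NODES; i += BATCH_SIZE )); do
  create_node_batch "$i" "$BATCH_SIZE"
  sleep 60   # pause between batches to stay under provider rate limits
done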

Etcd storage

To improve performance of large clusters, we store events in a separate dedicated etcd instance.

When creating a cluster, the existing salt scripts:

  • start and configure an additional etcd instance
  • configure the api-server to use it for storing events
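
For reference, kube-apiserver exposes an --etcd-servers-overrides flag that routes a resource to its own etcd; a sketch, with illustrative endpoints:

# Route events to a dedicated etcd instance; everything else stays in the
# main etcd. Endpoints are illustrative.
kube-apiserver \
  --etcd-servers=http://127.0.0.1:2379 \
  --etcd-servers-overrides=/events#http://127.0.0.1:4002
# (remaining apiserver flags omitted)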

Size of master and master components

On GCE/GKE and AWS, kube-up automatically configures the proper VM size for your master depending on the number of nodes in your cluster. On other providers, you will need to configure it manually. For reference, the sizes we use on GCE are

  • 1-5 nodes: n1-standard-1
  • 6-10 nodes: n1-standard-2
  • 11-100 nodes: n1-standard-4
  • 101-250 nodes: n1-standard-8
  • 251-500 nodes: n1-standard-16
  • more than 500 nodes: n1-standard-32

And the sizes we use on AWS are

  • 1-5 nodes: m3.medium
  • 6-10 nodes: m3.large
  • 11-100 nodes: m3.xlarge
  • 101-250 nodes: m3.2xlarge
  • 251-500 nodes: c4.4xlarge
  • more than 500 nodes: c4.8xlarge

Note that these master node sizes are currently only set at cluster startup time, and are not adjusted if you later scale your cluster up or down (e.g. manually removing or adding nodes, or using a cluster autoscaler).
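
On GCE and AWS, the kube-up scripts also read a MASTER_SIZE environment variable, so the automatic choice can be pinned explicitly; a sketch, with an illustrative size:

# Pin the master VM size instead of relying on the node-count heuristic.
export MASTER_SIZE=n1-standard-16   # illustrative: a cluster nearing 500 nodes
./cluster/kube-up.sh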

Addon Resources

To prevent memory leaks or other resource issues in cluster addons from consuming all the resources available on a node, Kubernetes sets resource limits on addon containers to limit the CPU and memory resources they can consume (see the relevant PRs).

For example:

containers:
  - name: fluentd-cloud-logging
    image: gcr.io/google_containers/fluentd-gcp:1.16
    resources:
      limits:
        cpu: 100m
        memory: 200Mi

Except for Heapster, these limits are static and are based on data we collected from addons running on 4-node clusters. The addons consume many more resources when running on large clusters. So, if a large cluster is deployed without adjusting these values, the addons may continuously get killed because they keep hitting the limits.
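
One quick way to see whether an addon is being killed for hitting its memory limit is to inspect the container's last state; the pod name below is illustrative:

# If an addon keeps restarting, check whether its last termination was an
# OOM kill. The pod name is illustrative.
kubectl --namespace=kube-system describe pod fluentd-cloud-logging-node-1
# Look for:  Last State: Terminated,  Reason: OOMKilled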

To avoid running into cluster addon resource issues, when creating a cluster with many nodes, consider the following:

  • Scale memory and CPU limits for each singleton addon, if used, as you scale up the size of the cluster (there is one replica of each handling the entire cluster, so memory and CPU usage tends to grow proportionally with the size of and load on the cluster); a hedged example follows this list.
  • Scale the number of replicas for replicated addons, if used, along with the size of the cluster (there are multiple replicas of each, so increasing replicas should help handle increased load, but, since load per replica also increases slightly, also consider increasing CPU/memory limits).
  • Increase memory and CPU limits slightly for each per-node addon, if used, along with the size of the cluster (there is one replica per node, but CPU/memory usage increases slightly along with cluster load/size as well).
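
As a hedged example of the first point, one way to raise a singleton addon's limits as the cluster grows (the addon name and values are illustrative, and the addon manager may revert live edits to managed objects):

# Raise the limits on a cluster-wide singleton addon. Name and values
# are illustrative.
kubectl --namespace=kube-system set resources deployment kube-dns \
  --limits=cpu=200m,memory=400Mi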

Heapster's resource limits are set dynamically based on the initial size of your cluster. If you find that Heapster is running out of resources, you should adjust the formulas that compute Heapster's memory request (see the relevant PRs for details).

For directions on how to detect if addon containers are hitting resource limits, see the Troubleshooting section of the compute resources documentation.

In the future, we anticipate setting all cluster addon resource limits based on cluster size, and dynamically adjusting them if you grow or shrink your cluster. We welcome PRs that implement those features.

Allowing minor node failure at startup

For various reasons, running kube-up.sh with a very large NUM_NODES may fail due to a very small number of nodes not coming up properly. Currently you have two choices: restart the cluster (kube-down.sh and then kube-up.sh again), or, before running kube-up.sh, set the environment variable ALLOWED_NOTREADY_NODES to whatever value you feel comfortable with. This allows kube-up.sh to succeed with fewer than NUM_NODES nodes coming up. Depending on the reason for the failure, those additional nodes may join later, or the cluster may remain at a size of NUM_NODES - ALLOWED_NOTREADY_NODES. A sketch of the second option follows.
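
# Tolerate up to 5 nodes failing to register when bringing up a large
# cluster. Values are illustrative.
export NUM_NODES=1000
export ALLOWED_NOTREADY_NODES=5
./cluster/kube-up.sh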

The following describes the GCE machine types referenced above:

Standard machine types

Standard machine types are suitable for tasks that have a balance of CPU and memory needs. Standard machine types have 3.75 GB of RAM per virtual CPU.

Machine name     Description                                 Virtual CPUs¹  Memory (GB)  Max persistent disks (PDs)²  Max total PD size (TB)
n1-standard-1    Standard type, 1 vCPU / 3.75 GB memory      1              3.75         16                           64
n1-standard-2    Standard type, 2 vCPUs / 7.5 GB memory      2              7.50         16                           64
n1-standard-4    Standard type, 4 vCPUs / 15 GB memory       4              15           16                           64
n1-standard-8    Standard type, 8 vCPUs / 30 GB memory       8              30           16                           64
n1-standard-16   Standard type, 16 vCPUs / 60 GB memory      16             60           16                           64
n1-standard-32⁴  Standard type, 32 vCPUs / 120 GB memory     32             120          16                           64

¹ For the n1 series of machine types, a virtual CPU is implemented as a single hardware hyper-thread on a 2.6 GHz Intel Xeon E5 (Sandy Bridge), 2.5 GHz Intel Xeon E5 v2 (Ivy Bridge), 2.3 GHz Intel Xeon E5 v3 (Haswell), or 2.2 GHz Intel Xeon E5 v4 (Broadwell).

² Persistent disk usage is charged separately from machine type pricing.

⁴ 32-core machine types are not available in the Sandy Bridge zones us-central1-a and europe-west1-b.

